diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/README.md b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_config.json 
b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_model.safetensors b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bb32ed3f5cec4d9f305d39c8f0b4f5e4099fe5d7 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0c715b243a6384c6071c1ad514f550e439d8f08d13f34da020f30cee96838f24 +size 67144544 diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/training_args.bin 
b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/README.md b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information 
Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_config.json b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + 
"target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_model.safetensors b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e006571f24a9d4a7afff9af7a91ed568a9892de3 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:204894f337f4262e6ea940df81e3c9da95f98b3eae0797156e49cb8e9739cd31 +size 67144544 diff --git a/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/training_args.bin b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/model_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2dc95f4204da350318d8af17a5d9ba5f310f50a5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:e88bb97654c70dbea52273fe435cfb103e74ae375d545a5e3a0fb6a71c490571 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..73800d2507a3245a67f149e4f0d0128b2d11f191 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c2dd2c557e038202b5a59aadd597b4907fec24c03d73f7b92869d1db8414dbe8 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d0cb160fc6752dc0470bb88b1ba16dca7ed969ca --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fd418aa175a4f9508778329e5c11f54241882ad7316c344103bc3804e613599f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c7c7ba5a5d73c30d2e2dfccf92552709b61b1a0f --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d4f7e5b3f15e6248eb69742a14f905c700ecf357f80b4e2f91b8b83b2a38d15e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..6665d2c680f9d5dfa6e309ebaff698ec28c25382 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/trainer_state.json @@ -0,0 +1,48 @@ +{ + "best_metric": 1.9985876083374023, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10", + "epoch": 0.13333333333333333, + "eval_steps": 10, + "global_step": 10, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1638607198617600.0, + "train_batch_size": 8, + 
"trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More 
Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a3e6bf99060efb836531b2f79bda80c3c17afb2b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d20aa253952104483c7e342f53400c0726a91e27ea99b1b2f174439e7ff18034 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bf3a71597e2eea2c3bee602cdb4ae4ef5c933ab0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b1377e5d0dd6af7db06b8176b8910052f06364d4306106d0ce0a58876af1acab +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..e6cdf36295b4d559507cf0b068680edea3de3a81 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:46513e9b1de488f3d70a4461303e6b827989f588807354e14d010b7ee4f4679f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4f1a24bb7d4e46bd15c0b55412cc8ba9b9556c35 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd2ccdaca083e589c09bcd97757fde390a191ed5c643ace13a70b750fd4a4e4b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f4bf7370913c48df100e31cbc85b35b2dc4735f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/trainer_state.json @@ -0,0 +1,183 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.3333333333333333, + "eval_steps": 10, + "global_step": 100, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.6386071986176e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information 
Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..37bfd95922c9953d5e3954d2ab0d70cbe8d19ebe --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5c27f5b2d5f3041c7a4793cfec902822c4f711293b3d9a8ae8c646f8fbbb525b +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5eae0e1328ce786993b12b089aee31706b76c31f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bfca437de3e5ab5c8453fbe568caf086db8951496c90e46d7aade577df63224a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..e8b03e39b0cf81b4b723b9421b9fca8f87c7b414 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:319884e2d6c1fad0795ced8add37e8073910c77073120da512a5e6a1f6208d62 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..24dce4e18218617e13af9f93046f397a711717c2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe938637817d41932e7175fe8d9bcdaa1f1383328b73e4b56a4e373476a295ba +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..01fd771e0b55df7257a47f6f340dad56b355a981 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/trainer_state.json @@ -0,0 +1,198 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.4666666666666668, + "eval_steps": 10, + "global_step": 110, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + 
"total_flos": 1.80246791847936e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper 
[optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..42585f8ec965b11f928a520617c6705ea0365816 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2d8dea0f7ed59106474f85fd8338ca1f7f2f0b6e934b8f36c521e37eb468b099 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..aeac4ef8e41c984cd4c6ca6a28d40ff142916008 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:70a1f987cb655be5552c88bccd6678d37d6fba567532432dc19341a56b89c921 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..71b7a5227226dcaeadffec096acbc7df0f632989 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3500ac793bd5f15c49da717801f854f9815260499ab4bc16b8f3a1ca9c82dfdf +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9eb62bb8ba22966a1e254979e1d2479886d174dd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:145815d6a6480fb85323e9a0f9a98f3e8faa57003487fcac0be85abbf27b4575 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..017f57312335417233f4208ea404ee1eea108694 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/trainer_state.json @@ -0,0 +1,213 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.6, + "eval_steps": 10, + "global_step": 120, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + } + ], + "logging_steps": 10, + "max_steps": 675, + 
"num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.96632863834112e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- 
**Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2420fe6c33796df8bcdeb7585f92f3a8dca6a580 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8cd79efec7ff9bb3dd33d6edcfd911e06f586aa96032295a709594425ea4adc3 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a529c5138ab2ecfe40082840742a0cc9f8589ec5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5033ed59036560e88cf308b31fa2d5e8e394336bdb95bc28eda95eb534a5ff0 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..b60cc4cb8217ae694c7a8efef0eb0b676d897e83 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:602f503f7cd2e84c0b6719714b66d34e98b340f44b02ba8ffc44df096e786100 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0dae8e46aca4beacf0c154c37d71abe175363a25 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:abdc7730bfbf0869132cbbd456c580122a20a540399e30640d4e51daf6f379d3 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..27df437872f42bc4b2679efba247ca80b423801a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/trainer_state.json @@ -0,0 +1,228 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.7333333333333334, + "eval_steps": 10, + "global_step": 130, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.13018935820288e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/README.md @@ 
-0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..98231917493d37978c346185975d46256217d4d6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b488cbe5455b0cb373c6db6b0c4d19a592d7c897f93d6bdc448776ddd27e1d3a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e32e04d3e784562afdaf4def1467d79eb8c04223 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bed00195633f22ad99eb0e4814132d1882ecc7e6d6908c147c9cf74f158969cb +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d05f19f3c7e1e4b728f62f56852d18785b6ab4d0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03c218af617af689aa7eff2d02ae91fb859e96fcb9571b641c5e95247f137dda +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b77086a6cbb29f3cd0e1ac947f6c71c390b2dff3 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:21a6935970b037ba9fc4b9dc75dbda421fb162f0fa5b7d5502a5e9660c005897 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..16ee6dc97a7bbdab1dd4fdfbdd9b56be10a5058f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/trainer_state.json @@ -0,0 +1,243 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.8666666666666667, + "eval_steps": 10, + "global_step": 140, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": 
{ + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.29405007806464e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information 
Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..683deb15595d559b7ca984718dfd1c1f2b4d6a61 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:93c187225c756005e3ace9eabce0df25f0e3b399c75245bb8cd7189080159914 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f53eac617dc311a52e70bfd349d11d7beaf515e9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9c9ec3780e6469836d5f0a389869b2fc9abbd67d0f07becc450ccb4f89a01b19 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..61dde1ed8b180510bbda84f0c71356862600ad55 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bdf2188bfe5b1127367f0a0d0628c845d9f54239950b10ed26be9372dba68d0b +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e9d3263bcfa5d62a56c74c931026d6e1762a1781 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d75316f47d5ef08dad7230d3c189fb5ad736372bf2da793895c59a4ccba811f +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b8acc6fb4c2d6c1fa23fd721981c071240397acc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/trainer_state.json @@ -0,0 +1,258 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.0, + "eval_steps": 10, + "global_step": 150, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.4579107979264e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/training_args.bin @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ec46612edf99de2d3d04a32208d5e2df2576302e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:5513c2d5e6b3b024630caefc10a88e6ebc9ea3001e1b56884eb21958cc0a05d7 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..79304e6c643c26e9c046007cee904e59a7604400 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63ceb2499afe8e71393cc4d0e2d508219a1dd2b5f383a68e15611e3a35758d78 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..564fc6da8e7c6b2c0f5b62f1f2e55b96ec29c066 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0f1a4ff62819275ae908067e10e49db3630270d7e753db72e5d286184508926f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..333a8435179bb1a27e74cf71169524425347df64 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c60f731d4cb1d489de80d48b0d2bf2049ddfec30c083dac3c65e6fc26b9708e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..54837c0ff8a821d455d1c892c9e1558f334c8f69 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/trainer_state.json @@ -0,0 +1,273 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.1333333333333333, + "eval_steps": 10, + "global_step": 160, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.62177151778816e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/README.md new file mode 100644 index 
0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5fcaa0dbec8bd9e362bb9a1d3626b3f2a706118a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:8bda717cc4b786a5382f4a89ddf1896e1923c0cf94e03d1033f6acb139812603 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a88ea5b1545619d818938501936add2b34465115 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f7dd8f738f6d493a2b94e56b6eb461c1eb1db5ab9710a2439a01856653224bdb +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..c13cd397e2cbe97d2fb9e944d382c58418c6b136 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:964f6178720317ac51eb375c889b2d86c7184aa024caf52b59339853ffae03ca +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..98392616735ef4e842735f8fdb0443dd62c47cc3 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3f8316c64c3f1dcba9f5f78f5461a5450278d6310afba0a2471aa470b51e14fa +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b814358b707fb76fb429e946c5a52a6a810a7ae2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/trainer_state.json @@ -0,0 +1,288 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.2666666666666666, + "eval_steps": 10, + "global_step": 170, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.78563223764992e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..172ba4fb21c5f88e6d5b54fc8568f48fd6bfc18a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:c7c4a21fb8a3970a228eb581eaf4ab6d0ac603b456fc6d12c2ca09ff8d81f6cf +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bd0c55957f091467c7dc0dbe499ccbf35da302ac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:75baae09b29d6a1922d0e6a2444cd8f5df40593f096ca40bbbf6482280066d1f +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fdca3aeb31ce5b4aeb2c0f2ba53e3e43b6334331 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b79baa0842c2916b082cba36f9f2b958210e6d7c1813742841fb908cae57fbd +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4c07d6d39c8000e4887811925b35913c0d0fb9e7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5c510c48cbd7d4a31b049b9ce577d9a61337bf5b3120da8df24159e22a5b61b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..53eb446f7e58fef7d6bd73fab68298ff562b7cd7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/trainer_state.json @@ -0,0 +1,303 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.4, + "eval_steps": 10, + "global_step": 180, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.94949295751168e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..44eddbff0afdfc6a2b3938688e217054fcba0538 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2c9fa12c189839470c4c7dd5c3a23418db8ba8e942027193ea3c4c593433e65c +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..24427b522c539c790efcda5df4d9fb98798be8ab --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:45f1df84a83ec7ce238f5306ae0dc294f90f3ad0002b0957c2cb318fccc6ee11 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..ae44ad6727cf9b3af903ea84902fa6c7f13a5a95 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7d6f4346bdc8a12fcc48535a6002ac46345e4ce1e14bb1f7e9dc3b0ea920641c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c8d96687f829fbdebf86c73104630c11643191e8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:56d80eccd9a2998f395870ad7a48e8df26a0ef5fdd75c8bb18466e506b523f6a +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d852e57d447e1de647ae81bd1fd35fc8e03890bd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/trainer_state.json @@ -0,0 +1,318 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.533333333333333, + "eval_steps": 10, + "global_step": 190, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.11335367737344e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/README.md @@ -0,0 +1,202 @@ +--- +base_model: 
/workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..17dd8818abd5cb4497715d5e3caa821e54548e58 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:73abe7ce42b3f0819b25a64ac1b88f9b4e4865793277b75a0c863dd69380f20f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2358d4fe23687123a402502d31c2387acfcd45bd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f1c8acdf44ac93555b357c6cf3b9d6cd8a1fb2a324a81b53493305c58655576 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fe515b4492af517bd45c5a5c7abbba2b94c5ae37 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5087ba42b4dd9dc68875c89890b692068c71de7009ff67cb7d8492bce11049 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..61e40aef0a507fb8add486ba2535aadaa164b9a7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72a91e63074e9f0fdfc6b1e7414643f389732ccfdfe97b6b3f4c5b0d7a7556a4 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ed4841f70c374ad48b5b8c2b3ea74a55cb1d6de0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/trainer_state.json @@ -0,0 +1,63 @@ +{ + "best_metric": 1.9959219694137573, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20", + "epoch": 0.26666666666666666, + "eval_steps": 10, + "global_step": 20, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + 
"num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3277214397235200.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information 
Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9e4e8aed27939e5c5d98ee420aba943b6e2d6c46 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:282f92be600998c7fc2a3676e20a24879bdf6ab7e155e1166ee87033e31a501e +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a059456a07b6110c81bd14ef0e96e644f467542e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:128c7f0a6f900f301747e44692db10f155bb6ca18adda4ea509ab16076bcb372 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..da263858f32b7536e68a33626ef41e3ef7a44689 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0dbbe288070e588c7effbe11249d330a3ad16131211e6b5dff1d03a8ebc7517f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0fc1f2bea0ca1c9908bf307e3525efa76fd70425 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5b7dec72c2b7f015512ea839980ec16d0582c7e6d0689dad8794261e73838b6 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a64973d5ed6635698a7a6589dd3632da43c13003 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/trainer_state.json @@ -0,0 +1,333 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.6666666666666665, + "eval_steps": 10, + "global_step": 200, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.2772143972352e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/README.md 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a744f3db4f9553f2e142c58ed0c6f6a2382f1762 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:d64fc8e4185de40cd234ee0e5fd8b8e9faf1be06b0a42c02f7f1fc15d49cc51e +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..596d87b0fe446b6707ed09e70399e3baa88fdbdd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f01167a8d72e463410ce6762dd08cb43ef9530f0762fde80ecc2ca7bf0a4a51d +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..605214081e6b3060d6c3e526fc86e8b8fff3c71b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd4e0019fadc179e2ea531ff33d86db759cb80e64a8826bb6bfa90c2483bfc04 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3b44a2b0d3df617f15242e2d4ea4d5553b544573 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c59d7cb173602f981a42f5fe61d72e03c87c9f97f456afe9fd66cd09957f177 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..74fef7ba0065e620b866ee7ace897174d5c2d1f0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/trainer_state.json @@ -0,0 +1,348 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.8, + "eval_steps": 10, + "global_step": 210, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.44107511709696e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More 
Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4e57b4418c7e2be3ff61090a213975654d888755 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:abe87d39b7318d8e727733cb9456d545f58f582980cc7cf20a3e02e7b9eacea0 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d9b55baf41f193ea35df2a8968cc46e9ef60792c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2906be325444d50e6ba9599f5bf2457aa48c764131023afbdf441039cf88336a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..823c878e3ad7d7799e1959fba97c90aaf79af4f9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5e4256f7b7ace2dd6194570c191ab9026456dc0db24025edac4a5bd9e379dab +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..63906caa8cd7e3fc0686b7d0276e496942ef0036 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:daeceb22ea0c54e6923c8a042a9cfc5a5bc826f201c52f29454b62c289d49dc6 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..891178a7631a7594c3cbe799f6b605297bdfcae0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/trainer_state.json @@ -0,0 +1,363 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.9333333333333336, + "eval_steps": 10, + "global_step": 220, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.60493583695872e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/training_args.bin new file mode 100644 
index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware 
of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..7040bacb0acc3f41967c1a4bfa31c05206c260dc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5bb76393eaa7f21bf3bc154904b3d5b0b8ff429b802b25173db15a42db59ca71 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..936daf7139067f0b57267b704812aa3da49af21d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0e4da7cef988bd3881a3505fd65089df769e62945af3e8f18b06fd1a5b7955a0 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..ae85ad205796b2c3955218eb7b4b348ca35978c7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e2b38199e26ee1965ef79aea019c0217039e7dab109a4b6e29c57f1bea63d6d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3e92e5593d8d7139e837b2a75209a41c074c2e8c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7f755a0bef74517fb45fc39d7689eaec499187cc5cd60002751078b0276b353 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4c353597a8a99d77bbd1ef5818944c2ed0a0059d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/trainer_state.json @@ -0,0 +1,378 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.066666666666667, + "eval_steps": 10, + "global_step": 230, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.76879655682048e+16, + "train_batch_size": 8, + "trial_name": null, + 
"trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## 
Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..3e41b973edd657b3e13400ed2bc5685198b262b2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bdb3745ce942156a76b605fb2c2ec3c9aa7b9d8507b7121f2086867ea4b81bc4 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..36dd76d1afc13a4c80d3cd3ef1d26b1114931127 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4a2e5f976d5278939c5044d25e81d89261ee9ac2dfa4b8d0ea48e74083578ccc +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..846c31e0418b3b3196b4e9c5d730a866c947d1d6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:33d7857a6e3603508425c326c1a1dee439799d2c72bbfc8afcabbb8578757780 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..31e05b86275fed970cdeadc24115c84e19feae09 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f73703efe567bf60e5ab219b736abd5d1183aaab558b64454b92f8bc5cf1b3fd +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..20f72329911850bd865831c8eae8f8d83530836b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/trainer_state.json @@ -0,0 +1,393 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.2, + "eval_steps": 10, + "global_step": 240, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + 
"stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.93265727668224e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More 
Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a17ef5e0b81ed658dfeb1affb3764fbc76fec58f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:69aa0d15ae4184088fcff991f6463ddafe42e0c934b5c6a3663ca1b4b27a8435 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bd01d6a7f365c165263667d4e25b1243dc3dd788 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aaf461e522d728c2e98a72e3762ee25bd1e9480a38a60fe9bd3e884be04013cf +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..90df82c0a610ae490c2592c79d46fe23cde8d351 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5b7a10b9f8de84d4eac8f0b5437669695e0a3ed004e055b39340577de17c55 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..174b5438f88f4c3c799b43c4f559ca991fb938b4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7ef310c01f40cba8e9c44af8332d1cb681a7026399804fa2296ed59c6594e708 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e467f0d3769d4aba9a2f036106c36644bfb0b9b9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/trainer_state.json @@ -0,0 +1,408 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.3333333333333335, + "eval_steps": 10, + "global_step": 250, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.096517996544e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card 
for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6400491a5abe60e76410a99d77df0412ccf643bc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:de17e0cc95c7d3f7f1cebad9e1b15567369f70f461ff7cdc4b8ff2409ff2ea7a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ff47138e4d6a104fed385c54542004c18c8bca5b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cc40d682da714dd977de052b465377e01b4d2a043e5859ba362ba5c6e5872998 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..293d181974003fee2540af0648cfb4e42786ca56 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:78bbc69e88d5e1fb15138660b4de76d03b9476fa1ab2d16370f894a65eab3da3 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9431a5b0a8e3a7cfc7a6acff3f3ba51f0ea91b16 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e388642b0db2b68dfc847810d17830763a6c1ccd5a0a2c34607435281dfa7f25 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5a9eff3398b6c0d3b2775178b74e8123db8faf80 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/trainer_state.json @@ -0,0 +1,423 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.466666666666667, + "eval_steps": 10, + "global_step": 260, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + 
"should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.26037871640576e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model 
Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8a18a5d7e6f0f0329a4969e961235373d2cd0108 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d370c100a89ed09651646e46f4520278ae262e088f07c92543cce621bdc64f2 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5c6d7c043cbce4fc745e5f59d28878f6c57c396a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0ffa951c0558bcf6896b81df45c7d885258a5125ff483c3d81ecb9d4f117a25e +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..ba62c782c818c1b90b0344e262a00bb91255dc87 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2af2c0de08ddef877a4af0e5f2dfe4570d2f029659f125fbfe3bbcce3a8b09e6 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ed00d4e9803635011eee9bbdae275cac04953c1b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9b86f25c4fadc98da61c18896b4c25ab399b3a23b766274b50979d4340358b17 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..655b7898566ab5dc3d523d55c9de8a521be886f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/trainer_state.json @@ -0,0 +1,438 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.6, + "eval_steps": 10, + "global_step": 270, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.42423943626752e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bc7ccb1023f394202504e24557f9ad5987e92557 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:c42a40d14ff24376340fcdd9ebf91c73d7c4b6a7bad234a1733db2fb33c5ac84 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..37c7213d6983830b936b049d665e4a39fa7fd71e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14966cf3c98ce13cda8ded784c3d8be0b4d811101fa1245bf6fe7e314ede2238 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1702f62666b39cac633a34cf312f24e311e13df2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ba79aaff190fd3ef9f70dd7c0a234665c2bd6c6bb243b5896c5bd6a16356627 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5e2ffe406e2d87ea70e25bfbdad4187edda05acb --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d68cb0fb8d225e623592feefec72ecd0b7071657fb56415f262582b52279a56 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0f5804527842e7d80f3b83da41bacc1a1db06caa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/trainer_state.json @@ -0,0 +1,453 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.7333333333333334, + "eval_steps": 10, + "global_step": 280, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.58810015612928e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..333f7b04880cc98729141d95ac58670700b235a0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:8b5fe21c2e039fb7c5ba0df8b0db83652bf2c74283af351326ffd3fc36cd6120 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b1fefbc8a62f7616364f6905965635bad98312c8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a4d5097cd378b9f6184fe0a5bedc44a7514224e1c698d7855c77018d06f92459 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fecfedbf1488a31afeaf7c01dc4f9760cfff1b16 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:47c6345b8afbd1f7a687e942ce33ce022660a29cb46a23e4c9eda9e498053741 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1da17016e7f80351316298af3ab35d6cc666d60f --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:518f59f6861d3d54674180d781456c4d55d82eb1d5543c592846efd5b6bea3ea +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..9f121edf2794ff3cc116baacb3bbcfcc2e914304 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/trainer_state.json @@ -0,0 +1,468 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.8666666666666667, + "eval_steps": 10, + "global_step": 290, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.75196087599104e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5e6cb55fcbcc3c1b5babdb9a500298cb3909ff80 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:9faf9f28944ffc8cf1962154b2e31b8dc2ef1d2c7bfd180e47d1bee04b826365 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..8defc59b8d1fabc498befe65c3008008fbd82531 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b16ab48b034ffa107e9f0755caa33f7ab7e12bb23b399684e57252971e428fd4 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..76ee62462f7b8b87edaf24539d12d81995c70164 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3a5478e4e53ebdf948038ed344f6e976416991ec94630cb094a18d5adf7aae7a +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8e3204abc81bf616d4220ccab7f0f13520ce949e --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:19debbf018dbf40b240b0a2ef65d5d10de2fa92e61c8838b0319c8c96ad962cd +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3e5bc5a343ce0d94f608c2338ddaaf36573839e6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/trainer_state.json @@ -0,0 +1,78 @@ +{ + "best_metric": 1.99397611618042, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30", + "epoch": 0.4, + "eval_steps": 10, + "global_step": 30, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + 
"loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4915821595852800.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# 
Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..938e5206b0f9b1295e1908a4f470f5d8bd89feb6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:f137fec47ad85c663eb1ba74d8c333d5bda7a22b4b698594da1702f9ecfef05d +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..02b5adb9cb75a6acf75b89b92b45db4fd3e05e5f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a6e4c2fed6be1e1f72f6c60c30b22bd36e943509f8c7e1bc4709f845fdd54165 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d8ba268ef07796e970a23442889935701a1dda5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2574c6149307e492ef05d2031918a546356cc654f4671c817f05ae6d0764de7f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8640fdd49a163110aee721e1510c7d552b4242d7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:30336c219d20749546325363bfa0b5ee5e9d4b073a303024ff3ad347834b8c13 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a7fc2898aff2348262d9f3be2b3c97d4a6379e7a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/trainer_state.json @@ -0,0 +1,483 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.0, + "eval_steps": 10, + "global_step": 300, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.9158215958528e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d213c8c774b0c08364523c8eb295fc880ac975d8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:18c9f6f9299d35654f7bc0d7c1c8f2e91f4e9f402d2f8c1994e0f019aee8e1aa +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9448e39d3e9bdf2ab477e4e391b1eb00e031efd3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:acae09ab5ebdef72bd3667cccc337c9a23cef31cbe57920f630a722d83a3667e +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..a5b4503b006d8dec33c7a086d3d007eef4282144 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a82d768c5f5c231c8b50481a409281b8639e231a185281a7476164488eb6c27f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..58f0265f6abd6b6684c5edee08f03cf244492dc5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f67fc10f846f52b9c0359f08a436d3ebec080f189f60c98def04956b2dc83cd7 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c98b0385af9f7cef1cf414acd163953e8b419eac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/trainer_state.json @@ -0,0 +1,498 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.133333333333334, + "eval_steps": 10, + "global_step": 310, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.07968231571456e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### 
Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. 
(2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": 
null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..49ccb120779b1912ac7410c34420821a6d3fb966 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c04f0bc77f8b2b605cdbd372667ca60d92cd81cc00934a67581394770a91ad5c +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..af61fc126a6945d0eb635f2d7f7f74a7bd5c9598 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dc2cd5b7ead815fdcd45ef37ca8e988184ed9a827244207d5f48eb084f54b611 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/rng_state.pth 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f5fbaf3739704eea759ab29b4b9eba0fecf79ee6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4f581763059f9808c6971d543bee5e034fff1a9ec174cb7aa232dd9f17099da0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0f02c233f432413573681087f8ebce358efeb676 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ae2093149925b534f5c60211635bf0097e5b3bf50dc856b0e3f5b17717e52497 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5b2fba01fe5d3f976df0fc9d53c751d6e97b176b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/trainer_state.json @@ -0,0 +1,513 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": 
"./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.266666666666667, + "eval_steps": 10, + "global_step": 320, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 
0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 
6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + 
"step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 
2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + 
"eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.24354303557632e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e0578c1a1173c7aa583eabff5857b17f73bc7d28 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:0ad17f1ef67048f89d8373a7c145c260595f11d9c1a9089d9e26c12b15475f0b +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b907043dd5e5c17ccef5058530c6e17a1f943c04 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:21a43c5c39ac7e35dfc1ac2ff2c6d8a282f9b0371ace57f420c33e9c17263c60 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..759bff60bd0897427bf9d4410df520d35fd20081 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:389caf1bb32aae3a751e11d63ffe273f089df59490c4ac6e5883d944b329df0b +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..2df061fe83adad240544d1899eb2e5e2fb23a555 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:22b896fe763bef96dfe0d570de4fea5d935b3bf80de3a9b1b2918efca334b093 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..78fe10708b95a1977c0a60c41a8ea0c26d1a1dc6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/trainer_state.json @@ -0,0 +1,528 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.4, + "eval_steps": 10, + "global_step": 330, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + } + ], + "logging_steps": 10, + 
"max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.40740375543808e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More 
Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8e6a4c5d09f750dced85d363235b9296b22c3c1f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39f2510fca7405f10c16acddb805f08b6d0e5d7660068845eee9a4de1a5a39e4 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a81ceaab343015b1e2449e72f836805e402df3ac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7f2ca7e7cff24d1db961cd3b3771b0812a0c167f661dadd248886cacaf8f8760 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..4d7fc830aabf2c4827b0609ed6e355d0fa80523b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b904f845552beb994fcd34362e728f918c7473ac27288d463195b51c3ed73bff +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..28521d181a67af05811165bf7cec3a0fcb49ae9d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d03bb25f48f188323d4c5dda872d760e309dedbed641397ec2ec756835c29ac5 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b5f60e805a59636596342a601d11738a95198cea --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/trainer_state.json @@ -0,0 +1,543 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.533333333333333, + "eval_steps": 10, + "global_step": 340, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.57126447529984e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4af4a0f29c87d05943497967f1505fa6af427c72 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0630788b0ab67f2fb3d5ac5748f13d83984578f3c3fc6a5749e0fe2ce03105bc +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..8dc7cb9bce8a67c730ae7f5d8693e877e1834cc2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a7d0c2ea8cab91876c5c0e71c89ce759e297e814b212e60cb56ff868c7f779fd +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..fc3bb37d365dcd8ae3528d8e7242f7d2eae755b3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39cd0c0a4049d541d90e7c6154cb21167a341830884ad3558195617942678446 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3fc1a8fc07398191149b701be855b2b30b04d498 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8b8cfefd46d2412b7b17da7d799f9e9021312d0b294976f3e87f7063aa01557b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..df867422ddc9968a3846389b1f92558d06f0de61 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/trainer_state.json @@ -0,0 +1,558 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.666666666666667, + "eval_steps": 10, + "global_step": 350, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.7351251951616e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + 
+[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e9d60a7113965cee5e477da7a35909036eee56c1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9e486442b6dabd2ad3f18cb64640df24f0b970ba41f1f51a52b863735049b30f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..df267413de36b39194926787fbbaa58c884a9330 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1387563201f730ba75d2500054313175f562dc0715982ab1941379f72982bf8d +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..dff7e422d3f8fc71ea77fa33b28878ffbe8abd43 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d73d43b628bfbe3f56e29099c04e9e9584349f935d8148aa8c34849bf03ef49 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..fc7191e0d24d86be98ffef99b67fe56b52160821 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6cb47a43082c3958508d73a1bd58f111764a18725005ed6a37a8d99585cef386 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f2b2623ed6abf8ae949b34ce648f5ec73e7f186b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/trainer_state.json @@ -0,0 +1,573 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.8, + "eval_steps": 10, + "global_step": 360, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": 
false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.89898591502336e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### 
Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5bf231698e82de29732ef5af7168dc3f4ec3abba --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fd95b1f82959b9e4e1607acb0f58e72a3337f6112ae931367fb5683be18bf309 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ffd17fc6c26e630ca941b21f316fdb2b77e8baa1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1917f5b34018cde496c79ab99feb5cd56fa728b7f534493b4dd712b32ad5ee14 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..792417d4c800bc4c8f7eb21d5421678309a6165b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7c0e313f3d6f9e1adc7603b9ffa6f0ab3438f71ce0c71bd9a788485d02b981c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8bd8e28f2af2a751646ea36889854c5eda0b2292 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1511804f46c0ca65fb38b3cc2eecf2ff9872408b4f80615834923e731745685a +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..791da0c1c43c81858f3a7249edfab8218ef7c77f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/trainer_state.json @@ -0,0 +1,588 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.933333333333334, + "eval_steps": 10, + "global_step": 370, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.06284663488512e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** 
[More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..67b97a275bb2fa335a6726f113885a7781ceeb0c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:17a4738fa507205391fcefe4a4542f938b43d1b3638af30dd03dca30b92fa86a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..35bd01ded58b6d1e3cb16470652d2b8161e78c65 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee462939192d94ffbd62a9731f8f72711567278d94563f86170077e8978a0983 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..f3b952e81c9ed8c37528c0b9d4c13811ac0b62d3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5ce5744fa32738c65fe7785ec589c49d96370233c9386567c3f06dceedb5f2c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d9c94a2d554cb9176e1f6452c1c8064e701f6c9c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:69cb8b9fe313cb48c89565a287ca91c45004877815ee7660be6b701d2464119a +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f3fdee4bda4767acf938b126184621eae7578b27 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/trainer_state.json @@ -0,0 +1,603 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.066666666666666, + "eval_steps": 10, + "global_step": 380, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.22670735474688e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..83a94016c29382d583133e4e259f4ffddd749a83 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:1d84f39dafe5ff2a898e73d866a28f198a7183ff72a6b1a69564387688a41236 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1062c343592092f7758c7d5c8e288bf8d8171bc7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d2ecf6616146dfb8b2b429f645d89b85c2af2361bff31577b94e8239511b9c8c +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b458d8885e612e71d79c420d6ca3a40dcdcf7fd8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f47a6a8940dea009f3b7ce239248233dd458275df17acc4fa8ff99eb346e8979 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..bf1961280088992857ddb8fe8d4584423c44edf6 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6bf472a6dc646995e9eb3a1b728ed47b4f764790f096bc535722b440312b4b49 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8c7e6cd868eba706ee4f42ac218944fc149dddc5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/trainer_state.json @@ -0,0 +1,618 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.2, + "eval_steps": 10, + "global_step": 390, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 
0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.39056807460864e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More 
Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fb56ba9a5e41273821cc419274dc10952b33ca41 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9bbd66dc32257b95ef705809cabae5db6aa9841aa8449833924fdc489c0c7ef6 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7ed3765d50a4e1b107bc0422d78ebb2c80a73bf7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6ae6885d15bb28447c20d8bf34873117d953c157a570ca8d1089a967c73b28f3 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..cc0cb9030af17e56f3ab00fc0ad6850b4636069d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b5fde33a4ff115b0a519c0ef179183e0540c837c91cce3dba97312fa8e725570 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1159228ea69439db76026731513cf5c71e57f3eb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f953d62fd365ebab5cb8aad6e7c0cdb075e95f55a4cb36b4f4e0198710f2320 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a63b22b50590dc5e204144b8405822e6c8d0ca60 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/trainer_state.json @@ -0,0 +1,93 @@ +{ + "best_metric": 1.9916906356811523, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40", + "epoch": 0.5333333333333333, + "eval_steps": 10, + "global_step": 40, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6554428794470400.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information 
Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..7585af44477a0763e851622d7b7315dc1d758275 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:53d226df6e3301465fcbc9bcb233e814eba6a2a9f02545ec4a9052a6b15a55c6 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..71177ed7986f861073de8804aa18af1f70f1670b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a16f948ed0032df6007466123eee1763e032703cf41f9d55c0453e7c0e6e429 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..d06e3c475517e0d14c13a6ccad84a3f20110949a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96f529f9856ab8a411ac6b8078e33cfc18c0159c4947cd8cac8e1238fc1754c7 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f91719d8a1b8836b7155587d155c2b2cfc9c7e48 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bbe59d4638e3afc1c337d3e4814ea99d33c22eec7bbc39984af69898855ffb2b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bcfcf246056adf9d10aab89fe0bf42df18c81acd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/trainer_state.json @@ -0,0 +1,633 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.333333333333333, + "eval_steps": 10, + "global_step": 400, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.5544287944704e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..dd082f484ad1c3bd5b882e1d7c37c71ed2044f96 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e199a5952086a91ab9d4e5406aed76ad3ed5e94126b9d7060aa722d976d5f6e0 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ccaef2c54f9d4a1ccfd1fbee690c57b1ad9a994b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b64b8156f013d25f27da1a4277e91e4b84b7c268a2816a135c6f08533ccff0de +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..090a1de878697aa3e6255ed23ff26ce6e561a9fa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2cab01f3c0a9d66cf16eec91d8aebbfd533628e45bdb849b4c3e4ad317f15270 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..390146116c48e62f4426eeb3a1cf7a2ccb90f69b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72422499e547842d9c164e7afacfea53fe3941a7a106527c3755c473fa91c799 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..6c6ae29e9f3499ffae433e6191f84bbff0da9509 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/trainer_state.json @@ -0,0 +1,648 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.466666666666667, + "eval_steps": 10, + "global_step": 410, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.71828951433216e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + 
+[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9aa5027632bd0ec28e5e3302e7584345a266c700 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8d7e6aaaf6c65f2afac5e40a764a02266684454a6cbe08fd11b326d3ab582ec3 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bb7e239ebfc26955bceeb9025546a3dcdb5e50ac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:eab7dac44f2bb74b16ce3042014b4f00c0a870925f7b2efdee39e359fd672909 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..7c168ba589ab149907f65c12980a55da76890995 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02f02c3c7264962c7bbb05c73c2c2f9530a34cf2c29d550cdc787ae19eb6d9bb +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6b66301b7ac8ccf1308c1ac8d63d7000259489d4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5336fb81030d9ecbffa34471d17a4c3981e781c865d7ff7a9b59e360e4230577 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..53e1bea9cde34ad2b668d46992e65f86ee868d90 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/trainer_state.json @@ -0,0 +1,663 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.6, + "eval_steps": 10, + "global_step": 420, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.88215023419392e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + 
+- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fd7566e31b74fbfe9956dd352b86368063f9b3ef --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4252b58851911180b15b5c7eddc556174ddae11ccbccb72df188bcfdc7bc08dc +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0a7c1e6c25f37acc462027862bbc2e38f133c14a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c46f438b9cd6616cf0cff0abb0e04252f50ea3dfcf1fc63a828c7841d98d6ea9 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..eb08c850753d158caff59458c0a4d2fa22ad5de8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f5c1faf0e9eb010c64f51b35236463635709da903fff7194839666558e862b6 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e44e26be68a19106ac45dab84a43a732acb91528 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:579c34af7d7ec0609fbd3479f4f8d8571c4cef90c76d9f6bacf43740f58855d8 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c7bd12b72e74c900b1da1f80b7498b49d58578b7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/trainer_state.json @@ -0,0 +1,678 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.733333333333333, + "eval_steps": 10, + "global_step": 430, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.04601095405568e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by 
[optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4ba195753412a8e38eebf61aaef53861c90ec6eb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:475146c79a6eb4b4ab85f07fde4c4f0e8156ef3f7bd3dd1e05a937d6f91fec33 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..cada2f6aefa15c27c93e27fa6a3a11f733e47940 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a7dbc710b07522cba955ddf0c96508f71ab3033316a40ffb0c36a6faae35b57a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..5fdc5e50e381540856fecccc6c375074d1aa7b0a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54abee51bb88479cda4bf77e85c2a545e7fb3c5e42f56d1baa63f1344dcc0529 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f8f2b85f23363ba098112683059a3e46233b6bfc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01a563f529b13f402d286b14bda74d3530e1fcecb2bee786164bfa1339da3729 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..49fe5f81bc60c05a513fbeae208e4348aeec7309 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/trainer_state.json @@ -0,0 +1,693 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.866666666666667, + "eval_steps": 10, + "global_step": 440, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.20987167391744e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6bb7b3c6d6da6b4ee26d3255bcb16f029826feb1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:47971879d6e1764635401fbd3d009c4ee8d649dbd4d60b6736c5cc406247d461 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fd3217e63d28c6bc00922f6a166718cdaf1ef9c3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a4d5d7463b0414fcd9c3ab319aef15c9a83db03faec1e0bc3cc18955731a53f +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3e7c44b011328e871a23ca1fea7cc6ea78d70a29 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4cc0a8131f9f14b855b33975c5e795a94be3a332a0f3cf68a9ec3ab6ce73b177 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..671e99d731836dff5ed479ba9e24ab368c795616 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bb8360cb66be4e8be27b2f376c800950e3f00449fb6491d6247165f9aff23820 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8616aef7170fbf3d673a3025be1fd2f09a0f5ecf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/trainer_state.json @@ -0,0 +1,708 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.0, + "eval_steps": 10, + "global_step": 450, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + 
"stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.3737323937792e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More 
Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..7f025722155d2d016b0989b5f6ec526bbd926eeb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b33b7c25925b5c788bdf197bc8c336457003b3843db9b0cafd92866a5e3ae9f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..3611058bd0e611378ae61ea57c59965596c9dc59 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d48d7176fba5659aec998b3e77002b69282e5f1f5ab81edda5d70c12f0617442 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..82f7415495fcd1c3ffb5dae79c8c3a4c2269faa6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a6424cc1a4d391795fbea6a94823363dca21ce0e7ec6c433e8cb5b0aca0060f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..82f0764fa1ca7bd5d0d2c27e699e54f97149a9da --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:09ce894ec673ae7c851228a15e2e8a3dfc488203c01cbf434a7c4cbec9b7becb +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..40d7ecfaaaf95cc374c4de6af67661c6a7e4ed5c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/trainer_state.json @@ -0,0 +1,723 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.133333333333334, + "eval_steps": 10, + "global_step": 460, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.53759311364096e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..78efe25f0ce0b1b76d5609a86d7d9f7be61419db --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14924f1a38ec0a2dc3922d964a07cd357a183f462deaeddf2b773753637b7666 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..63e30310488072ab5b632f53bc809bf4db22cfdf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3fd22d0c11c6f4656481ed7bf1dea68ca6a51ddb3f277d0370cf3f3a516e0bcf +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..84ca1f63cf231e2aa1c43b465c46ef11c80bc867 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03fc4a1860f68759a4d7833f4317681e377d4e71cf91ab1f091da8cd71579d26 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9e4c1530ac9944d4b54caf372d4f9930c6597321 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee66a0b6b4d05213664fc79a1ffd83a3bbefdb7154906787c3ef06bfdc4539f5 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f428122b5c1a794f472524c99959396774092f9b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/trainer_state.json @@ -0,0 +1,738 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.266666666666667, + "eval_steps": 10, + "global_step": 470, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.70145383350272e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information 
Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b9b29a74ff08e66193bc0c02625a57db75bce4d6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:93d7bc50ab3231a50e39149413069234d8850c07bc9eaabc1a96eebd1640e2e9 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..14f0a91aa7509722a845ac4d2f6890ac6f2e1dfa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3456887deae02584de174d3880707c89e2ad09b5d4f9d2ef56112c5239ae2b1b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..302025be6f88ae472170fe5d230ba39d4ec976df --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:918d6ec8ede8d7a880512e2fc44b16d7c22df85e8b411a004d142edcf446c40d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9c7a583c3e236b2f110dd12004cef1d9a2b13311 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:535824b66976a8cd20163034000bf2ae1a203551ed6ea6132858b6421f4024c0 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5c16d21b3e6f13882e02dc393ed6e55a3bb2a91a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/trainer_state.json @@ -0,0 +1,753 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.4, + "eval_steps": 10, + "global_step": 480, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} 
+ } + }, + "total_flos": 7.86531455336448e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper 
[optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4bb8d127919e948010fe3a8aaf0a61c7db27fe80 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d2fccf9531b1a013cbba211d5bf9693bfd9c63691befdab81f480cdb8a3e6ea +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..54b89f371ea0fd821a632eb1b7f7982c837c5867 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e00fdbe7a2976608eae8758bacd9232e1840f133de3426b2851eb455c538719b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..031b265de35950a615eacc2c86e46292f552e541 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b56a3ff26dded8216d560cf73ba4817b5973851b78edbbf6aa9d6b515761df8c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5c509b1230d4d9d9bf05bb1cf38bcd2d3119d2c8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:65f3e63eff29379b2f31d4f746c0c715c2b686bd11d7e07aba3d5f29231a18da +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..550cb422da83e1c4fe58be77b59bbc1298d0a6b6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/trainer_state.json @@ -0,0 +1,768 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.533333333333333, + "eval_steps": 10, + "global_step": 490, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + } + ], + 
"logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.02917527322624e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by 
[optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4bd52bcfe739712be169a68fb6df25deccf06fcf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:da8457beb421cc761599f5e44483e0715c8c883c06ef7dfa63f1cef39a292dbb +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d0cd9f7b8c54126fdb64520973b50b4bda1829bc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9840210b4229c60b18a2048ce3cf560b2842a5ee51a07137486246a5865fd246 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..c1fc54eb4786e9f15244e8e4274b14688b87da5d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7062fa0264c6fb17100531852b46c235ce631a6626d5e19749a65ba8723532c0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cee24f7781db565e483521e84ddc6dd277a07ef3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f79415c3ece613ed89d676bff22f42086790a2bced0de6758824fb8c7e27fcc +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f72631b9afaacfa5110ad2f3e4c95397747ddf93 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/trainer_state.json @@ -0,0 +1,108 @@ +{ + "best_metric": 1.9898818731307983, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50", + "epoch": 0.6666666666666666, + "eval_steps": 10, + "global_step": 50, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8193035993088000.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- 
**Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5e0dd1b116075d03e2794b17335814e24f329e2f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dbbf9652254ae9bbd90f4396a7257ab8a6e5f2a1f344d3569e17039465e179e6 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..4100ec061acc82c74dd281cbc52839fa79209bd8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7f452ed64e20332a1a900a32d7bde476fe65c3db45e9cf88636adc85fb493e57 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..96edd96602542afab3935d537c8d1428ce43196b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:beda198a64f1e6f1db0895ff6a6859c2af4c98fbf9c15d1daa4dcca9c20f50be +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..36002f421a8027f0e22e1cea8d6c317eebfd0e2d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e63d56828d52c149ac34c43bdc2adc48c363068c94b9a3df26528670b68d615b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d18bc9bb9372a775ac536bdde3d5fa27135e9e43 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/trainer_state.json @@ -0,0 +1,783 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.666666666666667, + "eval_steps": 10, + "global_step": 500, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.193035993088e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2001974c660af56cc79cbd4888d08e469fbddbca --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:4082a32681d52e6a0cdf909ed87b5987aebdff41165451732bcb8f7805773b02 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a92d38b2204008b53ed225a0178023d283e4cda0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f7074f566ffa90a6a95298a280b3dcc49408edf4e291a6dc067eb0fa1fb073a4 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..52b85f2bd42c764f793cd9aa8382577ad1b51617 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:156b16fe2af6b1592b431fe36919ba4914ab9e672f318f884f5045be66654277 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..a5298f7a45852e72ab3264eef95969ac26ee5012 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14a2456b0fb437e597f1bc67f02d12ea64caadba3ce80e5a7bba56290d13a10e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..fdf37e9c711f420c89e2fa398ce1bd4ba20c2488 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/trainer_state.json @@ -0,0 +1,798 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.8, + "eval_steps": 10, + "global_step": 510, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + 
"should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.35689671294976e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] 
+- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fa329de9920bd510476f8962fa956ade43529951 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:429a0974b6d5dd4c20f24de610b50219cc5273a4ad3248a60b885e53e1f55cdf +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b4640328278442b12fea4b24efb0b1e14443e659 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7ea6c594ef6e4a5567b115bdb98718935d31b6fec0863bf83b7daee23624ee00 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..736afdcce42e3e1d5dec3aedeed239bc0b63975c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca29f15bc2264125f00923607dbea007ec921af3e528271a2bb77db5cd4d2b66 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d470aca3bc75a59cd83f65a7641e2227523184b0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54453f7799a2c12a65729e49535ef0d1133252bbba34418ca96403f477d1ed92 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e7e051ad8bc7a163887d4875aa72a452348c80f2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/trainer_state.json @@ -0,0 +1,813 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.933333333333334, + "eval_steps": 10, + "global_step": 520, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.52075743281152e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/training_args.bin @@ -0,0 
+1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6015ebbe4ee22ffc1f1fa821f3cdf36bba687c93 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:9119dc8c4ef918c7da30e9b82f20717796240e2a18fea30ddfd7d511606b51db +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..69ddb4accd56b6ca9ea13cc77fc19d8e8bb53655 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7d1e020a223c6054537acadc2b00199366b32c6ab639700c6fa3bb8093fa19b1 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b0413aa128dc89fb63c7a74242ac1a6da3ecf5bf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9436217a6dd3838565d7b9845d97ff2e933eb514cc6ac99465ebc3448de3312 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3038016ab1789281fcb7570057f9ac7ff03feda9 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67026c5b7b6af0a730215316d61a8dcdd8b26b784be7a50e23105aea365fc01d +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bb2fda9e13acb0a1d3591627c3ce75bc6042f1fb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/trainer_state.json @@ -0,0 +1,828 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.066666666666666, + "eval_steps": 10, + "global_step": 530, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.68461815267328e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/README.md new file mode 100644 index 
0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0254d14546024feb21530442ec02f23c4a6cacc6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:86c6ee3b9c60739d3e013f79a6b1aee80cab93cf9624dfe638b931283f901b78 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7416144ded8351436d830dd2c5bf7f035c87921b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:76aeb4e35c717e79c2aeee223f073d5a69ac81a4d37bf082d0203f22c6e3c0b8 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..8d48caf21e655a01d7675a2b465c934cea676943 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:816bfad4f86e01da7fe3bd5bf7d10c902cf135a5b5fec9e0170158290fe5828c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4191ea1c11397f76dbbb9677283fd3b541b6e689 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:61e7bf31ab25b6a7b2f0902a2e1f6ca5545ad296580f627246378508da64fa41 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..78a7cee423bf8299623e52bce98aa5b8a59cbf2c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/trainer_state.json @@ -0,0 +1,843 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.2, + "eval_steps": 10, + "global_step": 540, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.84847887253504e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e2571da84c0e2733247a99fa0619b0bafe9a8e5a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b7253e51765c53c756815bc777cecbb38dcf3aa4c9e788ce520f82e6b8a1a2bf +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d3037e21e5629ea46a8ad9c73e340ba182220908 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa1ac052faa10b4652a09e06c1d63351a83a4d4b48a2c003333fcdd30499b03b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..9dc1ec111f2a6f7fbe8d878013e83df65b5f618a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5b6faa8c50c89ce52c86274c8c795afb3f00524e7aef4544572df4b5b6b12c6d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f59d0454a2e540196c447dc81e215fde49e60f8d --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe5b11bd9034273a78668f95788292b87ad00f4f53e9e4864d3471380b5838b8 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f71b2741f34c5c834d0d29dd4b3f7b1d68ce18ea --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/trainer_state.json @@ -0,0 +1,858 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.333333333333333, + "eval_steps": 10, + "global_step": 550, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.0123395923968e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null 
+++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b562f8ba52aed0807922912b433510236a08122a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c2be48e955e2c663986c0a02a8be95e41f892b19b3871659522c551649551e2f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7198e0149dfe1033c13b72f626445cc43abf0521 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7b357d2796a4ae8b9a9cd2a01df5c70c008cdbabe39dbda116ecf93670570927 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..75311ff97c8628cb71fe6f6cdca5e9e1127d30b6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b6745ab2a92f54dcacb73c3ceec9d54235e5b225134fb7703879ee6185ad897 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ffa4f4faa1638037c7009a7874a8ec2f958a56f3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01b0d6dead233f71cda974ae02165d32469a3692fb9b97739fca51d1798a012e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3ac34f682654267a6b56bca98781b3f71dd24aba --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/trainer_state.json @@ -0,0 +1,873 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.466666666666667, + "eval_steps": 10, + "global_step": 560, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.17620031225856e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/README.md @@ 
-0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b41beba1cedde8c06c922ec068b223c8bc873b04 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:65f5a8daa4f6598605b16cd233f8aeb5bcdd01f6147deb629a65816246171bc7 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1712bfa617b2a59d3ad74da8b30702b8a59b03e1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:58eb2c36003c8057ed2d6fbb4e3d9b1a9f2ea77a3468c11297dff7b83e1ef5ec +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3ed38f9a78b3dbf6f2e73e5bd68681ac198b1983 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d966d92a47b281ed57ee7f44ee2eaa60a54786f7ca9b7e8829ab8723bc8a5a1d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0f6ab5a4a6c1c8537d29396d68ff9a943067c8eb --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6a34ac3e6b737b225204c7a1c95f58427255f84b0986866cdac344b9d5ba4319 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ec04ea9e7b04e30d0f8cec7e175464c8ffe1874a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/trainer_state.json @@ -0,0 +1,888 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.6, + "eval_steps": 10, + "global_step": 570, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, 
+ "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.34006103212032e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information 
Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0f2ef9f95b690be939a60d89aa6db64b9b9feb68 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b351cadd5770d49b6b3b8141b2eaad16d527fa781b794f778aed342a17cb14f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f315fb1eec234f8d449e301cc30a546e524d9f6d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a9d76cfa166e5a90a027fbd9632f86665f06d10e6298656fd0cab2169161d721 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..6f12baaba3ec135e726e0b75dc20ee8cfe8a995d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:55a6ddc6425602c9554969e2910a1ee66847f95ab8fd86352843e16c6530b2c0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..df177b9452bbc35cb78b91089f310520fe740b94 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3aa3352ae201120fa831c764f5b07fe3f9aa427e68763e4c88ed9af407727f22 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b1ac5151562cf4bad7ba3f98f7278e66262c8ff5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/trainer_state.json @@ -0,0 +1,903 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.733333333333333, + "eval_steps": 10, + "global_step": 580, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.50392175198208e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/training_args.bin @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9aedeb8dcfbb0d703740044f56fadc95110914d6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:ad00b4d3bfeb37ae50e412e0b0aed709c8b86b64736301e42179b65b9ac99a02 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0b88aadd205ddc4de634ec46b0c733d69d86ddac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ddaab358d26c8f8318a0d86ad415fd02b2be818c07b0d0a99bd07d56937ff277 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f2cbe02e4922a4920c0a827f09f6df580967beb0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5704b322a17ce5b2788c1247543e3ca9edc36d083fd8ecc8ca80d04334c6030 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e93107ab0b0cdc649d183c879754ed083006f9d7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c640cac3d338c5c53c53ad351f9ec822b97e3962fe58e3c4439d6cecd03512ac +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a1ad1ccd52afac7d6c351cabd7f8ec95d0bfb34b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/trainer_state.json @@ -0,0 +1,918 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.866666666666667, + "eval_steps": 10, + "global_step": 590, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 
43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.66778247184384e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- 
/dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..93e4343c689042f65d219138b87498e9b8c8aaf4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:e1c6adf5af709dc92d818462e72c5ae53d451c7dae000935c79144a5958de427 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c05c59608960250d882b035f53d62fa9a68e4e6b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:322f0bf7abe5d0578a6a21deff6200706cc9092ab3d6090afb37d12660977379 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3d041c10a3af80c2be01488b87e7c23a107acab4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:224b98cd2a3813f8f156af229101dde99ced2e24294f3d7ad7b1538fdc49c27c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e35866f32db88c57fbcc281885df929786abae39 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db64dfcaaa6d2770fdeb8c6c250f6efda7e6b2cbc236d50bf153703fcb63ac50 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f1cc5622e2e84f08a60698488d3fd1adadd6aef6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/trainer_state.json @@ -0,0 +1,123 @@ +{ + "best_metric": 1.9881596565246582, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60", + "epoch": 0.8, + "eval_steps": 10, + "global_step": 60, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, 
+ "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9831643191705600.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/training_args.bin new file mode 100644 index 
0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the 
risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fde2294cab31785aa3b357e69874b97d1d826ce3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f1c31c80d1311c2db70da295efeb50055b9f8f003f23f5ab4ea71d0c8f178c93 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..db596af366c8ef3194f687bd5f66b93111b1846c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:87611ef100b7b9bac1604b7ee7ee8b063adcf60e5ca39778d714e6a6dd576850 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..ef40b259bc3233779099c3b8651c2fe0a9d07fa5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9bbc772ea5a37ab482a5fa0d13a2014584215ee3da6246ff6fe50fb8dafbfb8e +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b706276d2ff18ccd83310c61d87eb2ed9fc15f80 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:efa99c3d03a71b7b58bf8c6b52c8cd63b4d6a19d88cbdc8dfd20580671d183cb +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0f89a5c3358803cab5a99f4e21105d29e145db2f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/trainer_state.json @@ -0,0 +1,933 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.0, + "eval_steps": 10, + "global_step": 600, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.8316431917056e+16, + "train_batch_size": 8, + "trial_name": null, + 
"trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## 
Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e5dc38fccb1c4e45f0740dd393b25ee7fe33b818 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c89487d9371879549e3177412f72a693afee70de7fd32206752b7752b731518 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..31534cfbe48a8139052cca61a17df0b1e68f6eba --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9043413f2b848dc0759e8e30941fc4fbe37830df3c2d9363024ca263e497bfe2 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..6a970899a5edc16268fdea83560e0495a3d06810 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa5b53289977451ca52671d3897055616936322daf22f6e4246ff72a467aef1c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9ae36545a6cbe48ac387e9a4edd1288050b062ff --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b0de73605756aa391aaa9ea36adcbd12bd865860a2561b0aaca0c704b25cfe02 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..67345b59362301b1de09c763ee389d342c8bbaa0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/trainer_state.json @@ -0,0 +1,948 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.133333333333333, + "eval_steps": 10, + "global_step": 610, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + 
"save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.99550391156736e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- 
**Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..429ce5fced2ced2cae75ff833d6c82fcbf5ccc1f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4a3c43558879f690cbe66667b0fff9e7f2f81eca3fcc4d2e149a998f94fe39b2 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fde21536d03d968520c0b3fd7b98ca727426c0bc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2454510b75f37a2debc5a994797562cdfa0c005442e5df9ceba6877ac526708c +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..da7e5f0f7045f8fad1c1529974e555cc67b8f5f0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f2b2ce429e00eba0165cdfd527b7ca384fed68ae5660561d0cbc6dbdd51ce7f1 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8f09b0521c27c995f0878cde37cf7b4138abd8e6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0cecac504a0d6e20c848bc43265028cb51bdbaee46716ad0736302cdd3a2376c +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b06b01a78236e1a023ddb58338de6869a8db19cf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/trainer_state.json @@ -0,0 +1,963 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.266666666666667, + "eval_steps": 10, + "global_step": 620, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 
6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.015936463142912e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/README.md @@ -0,0 +1,202 @@ +--- +base_model: 
/workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fa08dde1ce3d6dfafb0021ac2a3a150c2db5c145 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:8c443da095929326658e505ac4e16600ec50ea036930a5893558ffba4f025d95 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ff8b9e267b9354808831c0382daeed8a7ce35023 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ad23b34ea1a8794e87755688f0c81056709a2591a808917ab43350357a22c879 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..96d7a3f6be074e46014211fae837a521e5c5140c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd6c4f62bed5401eddcf930d960632a48c624bea715ca64cedd7d04db198b4a0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..31b434c7d46bacc0a45ac73d9e6264e373e131cd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d8d4a20c36091528ac87a7edc5845454d614d78ad71a59c7a4ae563b2fe291f +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..7fbaceab69f6b58d831164c4594a24af3b080a13 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/trainer_state.json @@ -0,0 +1,978 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.4, + "eval_steps": 10, + "global_step": 630, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 
43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.2722527980804443, + "learning_rate": 5.333333333333334e-06, + "loss": 1.1965, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.3860819339752197, + "eval_runtime": 43.8529, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 630 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.032322535129088e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + 
+- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d6feacba26fcdd26802c8db3eb31d679ee11c22f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6dabd2b53c5a54e4370bd2eae7150d2ceac8db485752e60f274e0f7fb5c5c2b4 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..03d57847c4afd067f0ebe67b489635a487021064 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1257690b0d0eb8f3da913f67bcd5607a3eee190bdf6187e9bc9307989f618055 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..bc02fa7e506af341c87e94bd62a6cbdfbd057096 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0597f3b9ac321e002676eb1712670348770197d9b197cdd7a7e16f465315444e +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cc14f7545324288e67e156d036369e2cebdcf74f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd2d3d121570090627f59257118b55358f83f1b060f0fb11ab062387addadff4 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3f1d2b40e0c4299e68abe64673017cebafa8b047 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/trainer_state.json @@ -0,0 +1,993 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.533333333333333, + "eval_steps": 10, + "global_step": 640, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 
6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.2722527980804443, + "learning_rate": 5.333333333333334e-06, + "loss": 1.1965, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.3860819339752197, + "eval_runtime": 43.8529, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 2.896117925643921, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.1628, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.3896079063415527, + "eval_runtime": 43.8306, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 640 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.048708607115264e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c78059722e4bb41992e1f58bafbe08abceee607b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:9db0fad9a2fdfe76ec619ace6a0805d4b271d24fda27023e7c214c2d99c032c3 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7bb320e39c708f98ad38dbcd62f5a9fd579885a6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63ad46f4ceebfbf04009d676d09d69d0f0a1e5f0c3dde4ab34046a977d32ec3b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d763156eb3a586b51733d4ec683a815a6ae5fab --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e66e316bd2615a5005aac13970f8b8e71830843ea716191e53ff7dc38997af08 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..75f881b3ec9ba86b1878709fb0af361a6f712546 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:31b02e0b7ebaaab7bf8f183e3b47970500df166e496df9fdd39405913db43e64 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d2616e479345f27fe65319085a5f6b6d7ec79391 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/trainer_state.json @@ -0,0 +1,1008 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.666666666666666, + "eval_steps": 10, + "global_step": 650, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 
43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.2722527980804443, + "learning_rate": 5.333333333333334e-06, + "loss": 1.1965, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.3860819339752197, + "eval_runtime": 43.8529, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 2.896117925643921, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.1628, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.3896079063415527, + "eval_runtime": 43.8306, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 2.9708454608917236, + "learning_rate": 2.962962962962963e-06, + "loss": 1.1703, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.389633893966675, + "eval_runtime": 43.8494, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 650 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.06509467910144e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8d64dfdbae6bac99e4c04789ca88133adb7aaf5d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b305efce3ddb6c71389f2d829431ac0b7b4bc41cf777f0d19aa0004f46b6e1a0 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..90468de458eeb9b14fd31691c60ad75993dc486b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db9b5547a07887a7facb94ca10bfaed171e0151eec913105c67adb7bc299a668 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1bd0e24dcfea6867dcdb66e0b90f3344dbd9d339 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66fa7ea9452d536e82e5c18c4a0a05615143763aa569d9af13553a06a11128de +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..403f6f78ce81468eb12e4e1c093d8452c7d5a14e --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9e3b3eb476269cb66006445e45fa57a95b9d6fbb9998ae81b82199f9b98541e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..331d80d0b26ceac9363cae72c986ebab7d3fd4de --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/trainer_state.json @@ -0,0 +1,1023 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.8, + "eval_steps": 10, + "global_step": 660, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 
43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.2722527980804443, + "learning_rate": 5.333333333333334e-06, + "loss": 1.1965, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.3860819339752197, + "eval_runtime": 43.8529, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 2.896117925643921, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.1628, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.3896079063415527, + "eval_runtime": 43.8306, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 2.9708454608917236, + "learning_rate": 2.962962962962963e-06, + "loss": 1.1703, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.389633893966675, + "eval_runtime": 43.8494, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 8.89785099029541, + "learning_rate": 1.777777777777778e-06, + "loss": 1.1655, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.3882758617401123, + "eval_runtime": 43.8208, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 660 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.081480751087616e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..760b805d67198109cd88d5002229130306abc212 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:8a8e01f5004f1d115f8064d7eaba8e2cc0c3fb5b7dfcbe5b959c22504ae5041f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ab5b47cc5ca8fc25b9316dfe99c9c2849e396130 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e574401698a0c8fee81d17517d01f058ac8877a48185cbaf255ecb7ded1dfbf2 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b50ed8357a00070f99a52843c3e3d150dbd5b1aa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5bb0850ed44e50e4ccb2afc9aab9a80c17a31208454b069930105956f7f9a183 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..72ac93ee4249bf1220c3ed82f099c14ae0267a68 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:886b6be563b163a73eaac3a0ce905ce45ea5202bed173e897fec04ed18434edc +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b2b93e1609d4b3c2e5d0248140f84de898bf01c2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.933333333333334, + "eval_steps": 10, + "global_step": 670, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + 
"eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 
2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 
4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + 
"learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + 
"step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + 
"eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 
43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.2722527980804443, + "learning_rate": 5.333333333333334e-06, + "loss": 1.1965, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.3860819339752197, + "eval_runtime": 43.8529, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 2.896117925643921, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.1628, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.3896079063415527, + "eval_runtime": 43.8306, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 2.9708454608917236, + "learning_rate": 2.962962962962963e-06, + "loss": 1.1703, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.389633893966675, + "eval_runtime": 43.8494, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 8.89785099029541, + "learning_rate": 1.777777777777778e-06, + "loss": 1.1655, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.3882758617401123, + "eval_runtime": 43.8208, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 3.514944076538086, + "learning_rate": 5.925925925925927e-07, + "loss": 1.1841, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.387284755706787, + "eval_runtime": 43.838, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.097866823073792e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. 
+ +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a95ee25dc2385beeb7991d29cf46782908342287 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a7e091fbeb8edf476c4738e9d76d23e17a74caa81c07a6df24f777223d4d619a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2e40b64a7bbda41fa45616da5a5f0977dbf2af10 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b3f9731081c79863105397159a601877d3aa40982277e239881027885b0f7c57 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..bb61823d0d78956427b74dd1a3fc741ba1b2381f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c44717b587bf877ea1a37c7f5747a93e45e34ce231c845a31a9b8a042ee22593 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6f069e8ad5743a7071d53989d6edf25a382b7133 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d33603c9602f50d32bd619f686fa4097b405a474d15f526ce09de1176943edee +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d401a5046a1136def0c47a092147c5cfb43d5f31 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 9.0, + "eval_steps": 10, + "global_step": 675, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5291063189506531, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8725, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 1.9911787509918213, + "eval_runtime": 43.8618, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5129924416542053, + "learning_rate": 6.696296296296296e-05, + "loss": 1.853, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 1.9934632778167725, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6757937669754028, + "learning_rate": 6.577777777777777e-05, + "loss": 1.8164, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 1.9925707578659058, + "eval_runtime": 43.8279, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 
0.6200008392333984, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9228, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 1.9931683540344238, + "eval_runtime": 43.8152, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6585628390312195, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8206, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 1.9921337366104126, + "eval_runtime": 43.8639, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.6409647464752197, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9016, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 1.993605136871338, + "eval_runtime": 43.8233, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7298288345336914, + "learning_rate": 6.103703703703704e-05, + "loss": 1.8147, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0075178146362305, + "eval_runtime": 43.792, + "eval_samples_per_second": 22.835, + "eval_steps_per_second": 2.854, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.8521580696105957, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.8016, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0200603008270264, + "eval_runtime": 43.8017, + "eval_samples_per_second": 22.83, + "eval_steps_per_second": 2.854, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0043567419052124, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.8108, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.027850866317749, + "eval_runtime": 43.7992, + "eval_samples_per_second": 22.831, + "eval_steps_per_second": 2.854, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.083213448524475, + "learning_rate": 
5.748148148148149e-05, + "loss": 1.7673, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.032336711883545, + "eval_runtime": 43.8185, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1234967708587646, + "learning_rate": 5.62962962962963e-05, + "loss": 1.7618, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0335116386413574, + "eval_runtime": 43.7939, + "eval_samples_per_second": 22.834, + "eval_steps_per_second": 2.854, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.2352266311645508, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7483, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.030106544494629, + "eval_runtime": 43.8206, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.0464248657226562, + "learning_rate": 5.392592592592593e-05, + "loss": 1.7348, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0343356132507324, + "eval_runtime": 43.8173, + "eval_samples_per_second": 22.822, + "eval_steps_per_second": 2.853, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.2262767553329468, + "learning_rate": 5.274074074074074e-05, + "loss": 1.6389, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.060448408126831, + "eval_runtime": 43.8186, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4353421926498413, + "learning_rate": 5.155555555555556e-05, + "loss": 1.5998, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.0792157649993896, + "eval_runtime": 43.8361, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.9593505859375, + "learning_rate": 5.037037037037037e-05, + "loss": 1.6082, + "step": 250 
+ }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.091804265975952, + "eval_runtime": 43.8087, + "eval_samples_per_second": 22.827, + "eval_steps_per_second": 2.853, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.4801253080368042, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6935, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.1061298847198486, + "eval_runtime": 43.7974, + "eval_samples_per_second": 22.832, + "eval_steps_per_second": 2.854, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6240289211273193, + "learning_rate": 4.8e-05, + "loss": 1.6325, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.1056554317474365, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 2.0203073024749756, + "learning_rate": 4.681481481481481e-05, + "loss": 1.5701, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.1165740489959717, + "eval_runtime": 43.882, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5782257318496704, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.5787, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.100283145904541, + "eval_runtime": 43.8662, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7808781862258911, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.1154894828796387, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.9196659326553345, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.5053, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 
2.1655173301696777, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.246791362762451, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.3751, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.179928779602051, + "eval_runtime": 43.8229, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.105651617050171, + "learning_rate": 4.088888888888889e-05, + "loss": 1.4966, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.18562650680542, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.12894606590271, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5371, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1869161128997803, + "eval_runtime": 43.8404, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 1.844531536102295, + "learning_rate": 3.851851851851852e-05, + "loss": 1.4035, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.1778736114501953, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.221693515777588, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4538, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1807358264923096, + "eval_runtime": 43.881, + "eval_samples_per_second": 22.789, + "eval_steps_per_second": 2.849, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.1360249519348145, + "learning_rate": 3.614814814814815e-05, + "loss": 1.537, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.1719534397125244, + "eval_runtime": 43.8827, + 
"eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.849, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.0104849338531494, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4827, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.221073627471924, + "eval_runtime": 43.8346, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.3530654907226562, + "learning_rate": 3.377777777777778e-05, + "loss": 1.3511, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.2228455543518066, + "eval_runtime": 43.8291, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.6899282932281494, + "learning_rate": 3.259259259259259e-05, + "loss": 1.433, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.2609095573425293, + "eval_runtime": 43.832, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.332305669784546, + "learning_rate": 3.140740740740741e-05, + "loss": 1.35, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.253192901611328, + "eval_runtime": 43.8607, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.390275239944458, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.3902, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.250471830368042, + "eval_runtime": 43.8732, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.6510207653045654, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.3242, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2574853897094727, + "eval_runtime": 43.8192, + "eval_samples_per_second": 22.821, + 
"eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.48366117477417, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.3595, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.246253490447998, + "eval_runtime": 43.8623, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.809762716293335, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3594, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2539680004119873, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.973982095718384, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.322, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.310657501220703, + "eval_runtime": 43.8779, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.9569830894470215, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2844, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.31196928024292, + "eval_runtime": 43.8519, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.851, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.652702569961548, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3167, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.3202133178710938, + "eval_runtime": 43.8561, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 2.6680521965026855, + "learning_rate": 2.192592592592593e-05, + "loss": 1.2785, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3129892349243164, + "eval_runtime": 43.8731, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 490 + }, + { + 
"epoch": 6.666666666666667, + "grad_norm": 2.9108989238739014, + "learning_rate": 2.074074074074074e-05, + "loss": 1.2171, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3091511726379395, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.680732250213623, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3215, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.308401346206665, + "eval_runtime": 43.8733, + "eval_samples_per_second": 22.793, + "eval_steps_per_second": 2.849, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.888770580291748, + "learning_rate": 1.837037037037037e-05, + "loss": 1.2269, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3095664978027344, + "eval_runtime": 43.8781, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 5.66212797164917, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2413, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.3305203914642334, + "eval_runtime": 43.8363, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.077805757522583, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.1697, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3612844944000244, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 2.899893045425415, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2223, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.3476076126098633, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 
3.0700676441192627, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2159, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.365886688232422, + "eval_runtime": 43.8125, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.408560037612915, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.2659, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.357553243637085, + "eval_runtime": 43.826, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.284010171890259, + "learning_rate": 1.125925925925926e-05, + "loss": 1.2222, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3606622219085693, + "eval_runtime": 43.8313, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 2.894986152648926, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2158, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3607754707336426, + "eval_runtime": 43.8757, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 4.2904229164123535, + "learning_rate": 8.888888888888888e-06, + "loss": 1.1941, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.357357978820801, + "eval_runtime": 43.8709, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.144688129425049, + "learning_rate": 7.703703703703704e-06, + "loss": 1.148, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.3823156356811523, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.6422910690307617, + "learning_rate": 
6.51851851851852e-06, + "loss": 1.1718, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.390638589859009, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.2722527980804443, + "learning_rate": 5.333333333333334e-06, + "loss": 1.1965, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.3860819339752197, + "eval_runtime": 43.8529, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 2.896117925643921, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.1628, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.3896079063415527, + "eval_runtime": 43.8306, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 2.9708454608917236, + "learning_rate": 2.962962962962963e-06, + "loss": 1.1703, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.389633893966675, + "eval_runtime": 43.8494, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 8.89785099029541, + "learning_rate": 1.777777777777778e-06, + "loss": 1.1655, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.3882758617401123, + "eval_runtime": 43.8208, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 3.514944076538086, + "learning_rate": 5.925925925925927e-07, + "loss": 1.1841, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.387284755706787, + "eval_runtime": 43.838, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + 
"TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": true + }, + "attributes": {} + } + }, + "total_flos": 1.10605985906688e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- 
**License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..25c71aa9e3e0a4c7599cec39b6eb4f41d39af1bb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66f86c0bbb79fa5864673a7b0808523c104148135f9e16c4b8c41a201958de08 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..40050ee7104a6cc3ead9ec59ce9422ef3b37ce46 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5df9095e748543d60bf5f907be316c466c63dfd4b3a02744fde409fb41c3f80a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..2b1c959e3b92a9d3847cd61e595c79a1813cfe3a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bf8faccd3d2ca94b80304c3092e394e13d076f35c0c4f51d74490ac3412d5f9 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..edc613be9a8a7736c1c5e6c411193a18eb94121c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:116e4caee7c9274e6f2a7d93ee5e67e259426d00592030a182ec1bf7e3e1fd99 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..be10485eaaa044d1cd908541d2abce014ffbf0ca --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/trainer_state.json @@ -0,0 +1,138 @@ +{ + "best_metric": 1.9866116046905518, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70", + "epoch": 0.9333333333333333, + "eval_steps": 10, + "global_step": 70, + "is_hyper_param_search": false, + "is_local_process_zero": true, + 
"is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { 
+ "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.14702503903232e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bb32ed3f5cec4d9f305d39c8f0b4f5e4099fe5d7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:0c715b243a6384c6071c1ad514f550e439d8f08d13f34da020f30cee96838f24 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e2c42439f9b32591d93bf104ed5c0cbcbd640187 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c04726feb3f233016bf1a6cc7f54d78060225d901848aab8068950797da13425 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..0b228b8e8106f666fe286c5d131d496d926a7df4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:debbe8bbbf3d0dfd719072ab48974c332b6f78ebe25ef99f5002c8d0a8c8c380 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8bf25b5c8780313aa53c49c9a020653afda88fbe --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e54696b8c39c3b120a2b1d4d03623aee6400315f6e759074fafe42342c8bf95 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..740f430bc87c028ce6d0053a338341cdb56d35b7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/trainer_state.json @@ -0,0 +1,153 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.0666666666666667, + "eval_steps": 10, + "global_step": 80, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + 
"stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.31088575889408e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More 
Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + 
"rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2de676dad7a2904c3fb31aaf085986d87156ae45 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e7cd7cac6c9ab7e84f422cae188df5ebb1d597204a58abe8820496639c49f9b +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..90a151063f1377a072ca5e08a0fa6a7c7654a547 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:524b42af78c89d05481b35981c918d4a0db9dcd51bcab7b94139ae89b8019e1d +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/rng_state.pth new file mode 100644 index 
0000000000000000000000000000000000000000..4041231f7cc289aaec627b941b3ce1ed104a3678 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e1884689751e2c9aa53b83d7472089621e5727e27a037b479e2287c7b208b1a +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d1e19095e23644fde7d19dd9320fdb8daf7fd2bd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28209e35c6873af016e1c69801c50fdb913d066bb8fab0d3da00cafc566c1a5c +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..2de565602c7c435ea8e169cd30b16bcf52087131 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/trainer_state.json @@ -0,0 +1,168 @@ +{ + "best_metric": 1.9862533807754517, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.2, + "eval_steps": 10, + "global_step": 90, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": 
true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39686593413352966, + "learning_rate": 7.881481481481482e-05, + "loss": 2.0373, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 1.9985876083374023, + "eval_runtime": 44.0118, + "eval_samples_per_second": 22.721, + "eval_steps_per_second": 2.84, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4074231684207916, + "learning_rate": 7.762962962962963e-05, + "loss": 1.961, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 1.9959219694137573, + "eval_runtime": 43.9488, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.40370211005210876, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9942, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 1.99397611618042, + "eval_runtime": 43.8749, + "eval_samples_per_second": 22.792, + "eval_steps_per_second": 2.849, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.36992186307907104, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9663, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 1.9916906356811523, + "eval_runtime": 43.9664, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.35827192664146423, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9698, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 1.9898818731307983, + "eval_runtime": 43.827, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.44911250472068787, + "learning_rate": 7.28888888888889e-05, + "loss": 1.8419, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 1.9881596565246582, + "eval_runtime": 43.824, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 
0.9333333333333333, + "grad_norm": 0.32972925901412964, + "learning_rate": 7.170370370370371e-05, + "loss": 1.8322, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 1.9866116046905518, + "eval_runtime": 43.8126, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3448775112628937, + "learning_rate": 7.051851851851853e-05, + "loss": 1.9606, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 1.9862533807754517, + "eval_runtime": 43.8431, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.45006823539733887, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8927, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 1.988334059715271, + "eval_runtime": 43.9197, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 90 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.47474647875584e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..3cdfe665600003efbbe47dc966f68cae533b67fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/training_args.bin @@ -0,0 +1,3 @@ 
+version https://git-lfs.github.com/spec/v1 +oid sha256:7a17f4d6b71c784336ec2bdab63342cacd021f6692de20807ec5843a64f0e22f +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..7ae0b40d2c6987acf514982be986c49fd7ad7993 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:ee22269b86eaf89f221d37ce0413fee052156dfa6d8121ef01fe488312a0f8c6 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..33708e038cd81dbe4eb6327fe73026b297b12e9f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a11129e30559e49a2da807ddf7dd15022b7a38b5bc413a15f89180f73644881 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d0cb160fc6752dc0470bb88b1ba16dca7ed969ca --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fd418aa175a4f9508778329e5c11f54241882ad7316c344103bc3804e613599f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c7c7ba5a5d73c30d2e2dfccf92552709b61b1a0f --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d4f7e5b3f15e6248eb69742a14f905c700ecf357f80b4e2f91b8b83b2a38d15e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..2c939a0f7af04c87ad337e7a35055771a836a55f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/trainer_state.json @@ -0,0 +1,48 @@ +{ + "best_metric": 2.0317482948303223, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10", + "epoch": 0.13333333333333333, + "eval_steps": 10, + "global_step": 10, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1638607198617600.0, + 
"train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-10/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information 
Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8f966f4d4347abf7931e875f43901420d87863fd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:802d9c73b8cf5a1a5c8c5abafa34167fee460dcd482f31f24c94b0dcc359444c +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bdd7d1b77bd1ac30e61f85692c083dc88068c253 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2efa4e303aa246e8a3d96b556cfdd7ccc04ed9e7a3c27ae48631a19ba932b25b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..e6cdf36295b4d559507cf0b068680edea3de3a81 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:46513e9b1de488f3d70a4461303e6b827989f588807354e14d010b7ee4f4679f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4f1a24bb7d4e46bd15c0b55412cc8ba9b9556c35 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd2ccdaca083e589c09bcd97757fde390a191ed5c643ace13a70b750fd4a4e4b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..830b944ce07226cdd5afc1ec40500784d7dd6fe6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/trainer_state.json @@ -0,0 +1,183 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.3333333333333333, + "eval_steps": 10, + "global_step": 100, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.6386071986176e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-100/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + 
+[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ab16891e1f87026062b9f7e469180b58875050fe --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b98bb6626aaf45c07faab2b0d8742f773dd9f3ed71c2ebaa59fcdbe5262c140f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..8137a9444bdd5acae590ebd10c981dc4b784b1f1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b93cad49a30b23b3fe2e8580349b0988b998820391d60542f6d83dabfbccc61 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..e8b03e39b0cf81b4b723b9421b9fca8f87c7b414 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:319884e2d6c1fad0795ced8add37e8073910c77073120da512a5e6a1f6208d62 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..24dce4e18218617e13af9f93046f397a711717c2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe938637817d41932e7175fe8d9bcdaa1f1383328b73e4b56a4e373476a295ba +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..51ee1ae41fffbd61a4d7a3e87fbbdbe11a08378b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/trainer_state.json @@ -0,0 +1,198 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.4666666666666668, + "eval_steps": 10, + "global_step": 110, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.80246791847936e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-110/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources 
[optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..770e78bb1c3cc9bf43eba9fbcbddbfb30541a5d7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f051a8064ccf5920fa967f866324c6accec0aa31a910464ecc1fef2e6abc4e8d +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..6ca282d705a8e63a7876ae11dc3e3d4ea40a574b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa96043eba1914d892967bb61a6434c6fee9d808192693acdcd6465c42573d57 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..71b7a5227226dcaeadffec096acbc7df0f632989 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3500ac793bd5f15c49da717801f854f9815260499ab4bc16b8f3a1ca9c82dfdf +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9eb62bb8ba22966a1e254979e1d2479886d174dd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:145815d6a6480fb85323e9a0f9a98f3e8faa57003487fcac0be85abbf27b4575 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e01cc763eca341231c6c1d29b968020c111421a2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/trainer_state.json @@ -0,0 +1,213 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.6, + "eval_steps": 10, + "global_step": 120, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + } + 
], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.96632863834112e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-120/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information 
Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..45275be3076711029495aa75b262289b3d85c741 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd5117e4ba42bb9b891c3acf368c18170be8168aa691ac49cfc6ec1130bac7f5 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e9e5c4bca7547da59abd1b63735bf69250e24c7e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e5b1648290cf2a7011404f918b36faf01c8224d20e9cfca818ff36914a069ac +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..b60cc4cb8217ae694c7a8efef0eb0b676d897e83 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:602f503f7cd2e84c0b6719714b66d34e98b340f44b02ba8ffc44df096e786100 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0dae8e46aca4beacf0c154c37d71abe175363a25 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:abdc7730bfbf0869132cbbd456c580122a20a540399e30640d4e51daf6f379d3 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3dbea020a40b2b3d0f0e1313101cd38cfa116bd1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/trainer_state.json @@ -0,0 +1,228 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.7333333333333334, + "eval_steps": 10, + "global_step": 130, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.13018935820288e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-130/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1a99a50378065abcb9543be3212e58db6bee53e2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:d4ba98ccda1ec5a260bbd8d293881cdb3cadde4bfd51887d0cbbbc126e4d1a02 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2f317723b4474b3e0d6ee94134d17fd25883f487 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03a271a703e4f55825c6dbf00d0f755584b913eca765a0e160a596542f4af968 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d05f19f3c7e1e4b728f62f56852d18785b6ab4d0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03c218af617af689aa7eff2d02ae91fb859e96fcb9571b641c5e95247f137dda +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b77086a6cbb29f3cd0e1ac947f6c71c390b2dff3 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:21a6935970b037ba9fc4b9dc75dbda421fb162f0fa5b7d5502a5e9660c005897 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..2555d27dada473c3f03072f6bd2fcbe0b4638726 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/trainer_state.json @@ -0,0 +1,243 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.8666666666666667, + "eval_steps": 10, + "global_step": 140, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + 
"TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.29405007806464e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-140/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information 
Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4bcefe8ceeb449a8097fbaf00e78c156fd208800 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1fc23ab4da736a428ec68af52f064a815f10705b60c474a7e08af05d3eb0b2ff +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..cde5c297a0cc093f5d2a07c3415ab7b31b217bdb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:88f393ee30c3df86f8730d14c817a295fc38e4893daeec000516efd703a97ff5 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..61dde1ed8b180510bbda84f0c71356862600ad55 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bdf2188bfe5b1127367f0a0d0628c845d9f54239950b10ed26be9372dba68d0b +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e9d3263bcfa5d62a56c74c931026d6e1762a1781 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d75316f47d5ef08dad7230d3c189fb5ad736372bf2da793895c59a4ccba811f +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..072a5a4f13c21559951264a5a4ada0549db9ddb7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/trainer_state.json @@ -0,0 +1,258 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.0, + "eval_steps": 10, + "global_step": 150, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.4579107979264e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-150/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. 
More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9e2641bd7c72a307cd6d6ef9edaa0aa07be34e59 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1ec5c5d4d50f36f8bc3e37d974c09c31718567c853a5dcc1e19aba1e297d353f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..355376a86a5073d5c35cce3e280b65a357193585 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:273b88e969438628726bb7d3adab03584ebc1a5152ad9b55b8c2c396315f6e04 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..564fc6da8e7c6b2c0f5b62f1f2e55b96ec29c066 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0f1a4ff62819275ae908067e10e49db3630270d7e753db72e5d286184508926f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..333a8435179bb1a27e74cf71169524425347df64 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c60f731d4cb1d489de80d48b0d2bf2049ddfec30c083dac3c65e6fc26b9708e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..48b8cd9ec497c2e12206da613f6a7a0ddef28a5b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/trainer_state.json @@ -0,0 +1,273 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.1333333333333333, + "eval_steps": 10, + "global_step": 160, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.62177151778816e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-160/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + 
+[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cffff9841a655b023066746cf15a684fd3b7e02b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1eda926850c312ea492b6429f0c264a0368c2e86209816c7c31ddee7541bc8cb +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..56b264bd1b1275cb43a240f36133edd09fc9a85c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:36aee263dbcae8f68057d3dcfb7ecc291a203758145a706d963ed867baa2f058 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..c13cd397e2cbe97d2fb9e944d382c58418c6b136 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:964f6178720317ac51eb375c889b2d86c7184aa024caf52b59339853ffae03ca +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..98392616735ef4e842735f8fdb0443dd62c47cc3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3f8316c64c3f1dcba9f5f78f5461a5450278d6310afba0a2471aa470b51e14fa +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b76cd85eb15d8068c04c7bf11552e5a67ebb5171 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/trainer_state.json @@ -0,0 +1,288 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.2666666666666666, + "eval_steps": 10, + "global_step": 170, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + 
"total_flos": 2.78563223764992e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-170/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- 
**Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..51d47cc0745024a97bb714cdbd4547601b46068d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bda8bba87f05dd69207166768dcba39ed213a5905e75c8e8c2fe576275d92a3 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..073727e074be2ac4529411ca87693a13730dab47 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ce8a58b9276603c03b7ee7bbe39f9970d7f6fb8e46a3948ed8dfff5d1a52d377 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..fdca3aeb31ce5b4aeb2c0f2ba53e3e43b6334331 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b79baa0842c2916b082cba36f9f2b958210e6d7c1813742841fb908cae57fbd +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4c07d6d39c8000e4887811925b35913c0d0fb9e7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5c510c48cbd7d4a31b049b9ce577d9a61337bf5b3120da8df24159e22a5b61b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f8bc388a7b157b4f3a63f4b3289f897919ff30a3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/trainer_state.json @@ -0,0 +1,303 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.4, + "eval_steps": 10, + "global_step": 180, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + } + ], + "logging_steps": 10, + "max_steps": 675, 
+ "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.94949295751168e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-180/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More 
Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..22bb62170169d1f5c342326ce57dc067cac073eb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:11889cf63c022d4ca0caab443b7872f3779d63f492125e682e5ffa0928c7bab4 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ba21f4ec7249acaa523e62eb83e7112d5ead0e6b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4653ec4aff3cf6de1b41e61dc1f556c7583bbac252e3a33b07f08142f1c979a9 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..ae44ad6727cf9b3af903ea84902fa6c7f13a5a95 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7d6f4346bdc8a12fcc48535a6002ac46345e4ce1e14bb1f7e9dc3b0ea920641c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c8d96687f829fbdebf86c73104630c11643191e8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:56d80eccd9a2998f395870ad7a48e8df26a0ef5fdd75c8bb18466e506b523f6a +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4b413447606956bf01138f962593eead5711d7e5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/trainer_state.json @@ -0,0 +1,318 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.533333333333333, + "eval_steps": 10, + "global_step": 190, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.11335367737344e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-190/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ca4178517a43b3eb6d0fcfdd7eceddbc192f7ad0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:769e7308ae4ec137bf281123fedd83a748b1d94a43ffade40d737e8ef65cd3a5 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..dc3af1fec984eff4fc5f20c0475e19b56693314c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:61c5feff7a63e4e20dba6bd7a22c435a0ccc27b9cb900fcf6f4d976afa36471e +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fe515b4492af517bd45c5a5c7abbba2b94c5ae37 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5087ba42b4dd9dc68875c89890b692068c71de7009ff67cb7d8492bce11049 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..61e40aef0a507fb8add486ba2535aadaa164b9a7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72a91e63074e9f0fdfc6b1e7414643f389732ccfdfe97b6b3f4c5b0d7a7556a4 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ace11c81163e5474944d204f3f169049f120e30f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/trainer_state.json @@ -0,0 +1,63 @@ +{ + "best_metric": 2.0271925926208496, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20", + "epoch": 0.26666666666666666, + "eval_steps": 10, + "global_step": 20, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + } + ], + "logging_steps": 10, + "max_steps": 675, + 
"num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3277214397235200.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-20/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information 
Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fb2feedc294a2e016d3fc69a0393da2fe1a27e18 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bd5eb7fc38f229f25240da21ca63fff392bf07ca8ca253e611336a06abcff60e +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1ccdc012928c408d6b397743ddf7ba8fd16b99d5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d49bb73769291f55726094db040340943f5eaf8139dfa13f8dbb865ab93524c5 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..da263858f32b7536e68a33626ef41e3ef7a44689 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0dbbe288070e588c7effbe11249d330a3ad16131211e6b5dff1d03a8ebc7517f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0fc1f2bea0ca1c9908bf307e3525efa76fd70425 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5b7dec72c2b7f015512ea839980ec16d0582c7e6d0689dad8794261e73838b6 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..78fc7cef0f27954339476d19732e78d24a503454 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/trainer_state.json @@ -0,0 +1,333 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.6666666666666665, + "eval_steps": 10, + "global_step": 200, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.2772143972352e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-200/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/README.md 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1344689f2de605272ac7edb34345c2634399e56e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:a5b2b3e26567d421a78613c8df713109031344ce2d6b7130d0f22d9b201cfce0 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fc0116865c832e5af8c04e41fa26fe776b6e2efd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:de673d71e3c543aa9c1844ca9fbf36615e8d7a340f69f949bc8bf2fd7076fbac +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..605214081e6b3060d6c3e526fc86e8b8fff3c71b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd4e0019fadc179e2ea531ff33d86db759cb80e64a8826bb6bfa90c2483bfc04 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3b44a2b0d3df617f15242e2d4ea4d5553b544573 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c59d7cb173602f981a42f5fe61d72e03c87c9f97f456afe9fd66cd09957f177 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a1f1e2e2139850a01055da63fd8cfe02330ad159 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/trainer_state.json @@ -0,0 +1,348 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.8, + "eval_steps": 10, + "global_step": 210, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.44107511709696e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-210/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- 
**Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. 
(2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + 
"modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8b1bbf7082538dae894d637a6e63440f15aafcab --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:36dc7c25289adad44c8dcb9220db18c8097bbbf0f247d36aeaac00ba0b62c8b9 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..518dcc15b4d6b929313ea1999cfc2b50879f4bff --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f50bd8ec30001370019bfaf6104efd25817b8e4cd70868621f8ba26dd0004be0 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/rng_state.pth 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..823c878e3ad7d7799e1959fba97c90aaf79af4f9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5e4256f7b7ace2dd6194570c191ab9026456dc0db24025edac4a5bd9e379dab +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..63906caa8cd7e3fc0686b7d0276e496942ef0036 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:daeceb22ea0c54e6923c8a042a9cfc5a5bc826f201c52f29454b62c289d49dc6 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..1f3fd52d9e13ab17764324e22d396a36ef04ef62 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/trainer_state.json @@ -0,0 +1,363 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": 
"./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 2.9333333333333336, + "eval_steps": 10, + "global_step": 220, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + 
"grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 
6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + 
"step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.60493583695872e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-220/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### 
Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5c61c4db041f24c5569c0e219136fafc155e7bb2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:166c4eeb04f6a0cd67601f986c46210599f8047e2735dcbf90d9ce1f007c68b5 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b42ec1e856b9b909bfbd4a321081b33c8aca03ba --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e85709d7f650f6c242381e68c42d56ee67ce88f8dee3803ca199879f7cc38fb9 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..ae85ad205796b2c3955218eb7b4b348ca35978c7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e2b38199e26ee1965ef79aea019c0217039e7dab109a4b6e29c57f1bea63d6d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3e92e5593d8d7139e837b2a75209a41c074c2e8c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7f755a0bef74517fb45fc39d7689eaec499187cc5cd60002751078b0276b353 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8e5e48c79f93eba43cbf52af7c7a899eaccd86af --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/trainer_state.json @@ -0,0 +1,378 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.066666666666667, + "eval_steps": 10, + "global_step": 230, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.76879655682048e+16, + 
"train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-230/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More 
Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6fdb3b754d375f52eaf08805a37f5b7e11373cbf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d53a5238273eb643bf9a1cf2b1d9610dd96045e85c9d23445c8dee2a495d2b97 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9b8cadccc4544d807abaf5ab6c268a9b9858984f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fed3828e8d454551ff64383adda6df8072d2c932d416384c76cb892189ca4ab0 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..846c31e0418b3b3196b4e9c5d730a866c947d1d6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:33d7857a6e3603508425c326c1a1dee439799d2c72bbfc8afcabbb8578757780 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..31e05b86275fed970cdeadc24115c84e19feae09 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f73703efe567bf60e5ab219b736abd5d1183aaab558b64454b92f8bc5cf1b3fd +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b189a02fc86abd883a2eeff132974b3e60cef624 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/trainer_state.json @@ -0,0 +1,393 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.2, + "eval_steps": 10, + "global_step": 240, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + 
"num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.93265727668224e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-240/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model 
type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..927786ded5386ad2c53510c1a74e040514978fea --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:23ded2017e69f3e95abacbd758aa1aa44093c6ff7df310c5578dd117343b81de +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..15596eef142e92c11f0096e08507036912227b2f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8501db59a783326aa3af73730e3aeacc297ecacd3684484138feb5dde983fff1 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..90df82c0a610ae490c2592c79d46fe23cde8d351 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5b7a10b9f8de84d4eac8f0b5437669695e0a3ed004e055b39340577de17c55 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..174b5438f88f4c3c799b43c4f559ca991fb938b4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7ef310c01f40cba8e9c44af8332d1cb681a7026399804fa2296ed59c6594e708 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ca688055dfd7a4ab41e5fd58721f0841c1bdeb49 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/trainer_state.json @@ -0,0 +1,408 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.3333333333333335, + "eval_steps": 10, + "global_step": 250, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.096517996544e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-250/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/README.md @@ -0,0 +1,202 @@ +--- 
+base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..05456667c3e8ef608296c044e44a6c3bbd696a14 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:9517c7b76a347f8c671f1e174b469158c565c88a2d78125288469ca405d7f965 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2bc8d0e8977c22fc662285622ce0cfa394730534 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d4caefd346b377fdca4231a58cab813d1586114218becf135cce547cebed7527 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..293d181974003fee2540af0648cfb4e42786ca56 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:78bbc69e88d5e1fb15138660b4de76d03b9476fa1ab2d16370f894a65eab3da3 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9431a5b0a8e3a7cfc7a6acff3f3ba51f0ea91b16 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e388642b0db2b68dfc847810d17830763a6c1ccd5a0a2c34607435281dfa7f25 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4ce50c7df431588836e17bdd58d5a108bb87859a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/trainer_state.json @@ -0,0 +1,423 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.466666666666667, + "eval_steps": 10, + "global_step": 260, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + 
"should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.26037871640576e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-260/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** 
[More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0704b9b182bd36721fae337437c79c4cb0ea43b3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:07e04d7bd4133ed7a51555182473db9089a071ef6691abe64db5b4d23277dff4 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..98bdf70271fa930ec7133fd5a77fb332ebab723f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:083ead5aac0b1b6e3bc4e8a76feea5dc09f3e654086a80fc909cc2f63386507b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..ba62c782c818c1b90b0344e262a00bb91255dc87 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2af2c0de08ddef877a4af0e5f2dfe4570d2f029659f125fbfe3bbcce3a8b09e6 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ed00d4e9803635011eee9bbdae275cac04953c1b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9b86f25c4fadc98da61c18896b4c25ab399b3a23b766274b50979d4340358b17 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c54b73a16a7d5ff1bac21bf9d11a24de6080adef --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/trainer_state.json @@ -0,0 +1,438 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.6, + "eval_steps": 10, + "global_step": 270, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.42423943626752e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-270/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b01db205f40c44e17486f8c8ca304f00ce09527c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:75178ed93d69b10e7819805b25cb0e20b08e6d8b62a9d3cb302aa0b36867fc0b +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fc98707e98dded5ca0046ce0bb1778c85fe16bac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:da6a273fa6c9323b0f286be0db76562426c857e435dced9575329fff1280bde3 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1702f62666b39cac633a34cf312f24e311e13df2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ba79aaff190fd3ef9f70dd7c0a234665c2bd6c6bb243b5896c5bd6a16356627 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5e2ffe406e2d87ea70e25bfbdad4187edda05acb --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d68cb0fb8d225e623592feefec72ecd0b7071657fb56415f262582b52279a56 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e7937ee7315ce2d6129d0b6c0d6c6fb3d7c948fa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/trainer_state.json @@ -0,0 +1,453 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.7333333333333334, + "eval_steps": 10, + "global_step": 280, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.58810015612928e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-280/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..931408ed33003ab868d87ede6423c9361564495c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:2f3259f706824802431e1c5c655dcb19894855de93d84461eb138b7c670abaff +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f1f0c58ca4111683c37d3aeb166fb346bbe0ac7a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a9a09a1df5989a3a2fffea9cd0ed460ef221f9b01055a45844d822359bccf529 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fecfedbf1488a31afeaf7c01dc4f9760cfff1b16 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:47c6345b8afbd1f7a687e942ce33ce022660a29cb46a23e4c9eda9e498053741 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1da17016e7f80351316298af3ab35d6cc666d60f --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:518f59f6861d3d54674180d781456c4d55d82eb1d5543c592846efd5b6bea3ea +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..445587e8ace018241a55ec65c6f83a3f55792c34 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/trainer_state.json @@ -0,0 +1,468 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 3.8666666666666667, + "eval_steps": 10, + "global_step": 290, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.75196087599104e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-290/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b5da50c0aecb6eae3e2ecf64e633f09b9549e874 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:e826000a763caa9ca064d90345f5de608822e71d56430a3c62aa6efd1b17910a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5279a01ece01447d8e9eb752dcfb6c1e4fca38fe --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:245cb888b5fda223ee35d49a595ec6f4563ea38052e1b023eea085a243b4981d +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..76ee62462f7b8b87edaf24539d12d81995c70164 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3a5478e4e53ebdf948038ed344f6e976416991ec94630cb094a18d5adf7aae7a +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8e3204abc81bf616d4220ccab7f0f13520ce949e --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:19debbf018dbf40b240b0a2ef65d5d10de2fa92e61c8838b0319c8c96ad962cd +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..615206852897631925ef5e7b531c5906f85d5455 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/trainer_state.json @@ -0,0 +1,78 @@ +{ + "best_metric": 2.0226855278015137, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30", + "epoch": 0.4, + "eval_steps": 10, + "global_step": 30, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4915821595852800.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-30/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/README.md @@ -0,0 +1,202 @@ +--- +base_model: 
/workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d923f966ac19d1027fb72f2b71cdd08509464175 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b369bca02596d4a853f229b98f69f81531524fd318b9f3cffcd9ab4a0fdf5c50 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1794e09624f598ec0b5ae90d608ca8a7fc516338 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ea9fd7782a2fa7542878e1c2e66b994ecd6efeebecfd8c856912ab4593af114e +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d8ba268ef07796e970a23442889935701a1dda5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2574c6149307e492ef05d2031918a546356cc654f4671c817f05ae6d0764de7f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8640fdd49a163110aee721e1510c7d552b4242d7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:30336c219d20749546325363bfa0b5ee5e9d4b073a303024ff3ad347834b8c13 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..7877beddf3d60af7b3d5a01643b16479f575ad95 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/trainer_state.json @@ -0,0 +1,483 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.0, + "eval_steps": 10, + "global_step": 300, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.9158215958528e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-300/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. 
More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..45be786c9463b2be5d6544b6d88d8609abf56699 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b40d2eda87ef8a39bed036fb912589c211fd7e656a3e15ca7707b6aa1bc9ff9c +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a112487d9327dd59d5ebd503f7d866713fba1b2f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ed32e04c1d3a4a4d365f60cfb9808eac14e1a1c8bc335577c9f247296ec2987a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..a5b4503b006d8dec33c7a086d3d007eef4282144 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a82d768c5f5c231c8b50481a409281b8639e231a185281a7476164488eb6c27f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..58f0265f6abd6b6684c5edee08f03cf244492dc5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f67fc10f846f52b9c0359f08a436d3ebec080f189f60c98def04956b2dc83cd7 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..565a770ebed07bbcfea3f86bcf1851b119845a20 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/trainer_state.json @@ -0,0 +1,498 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.133333333333334, + "eval_steps": 10, + "global_step": 310, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.07968231571456e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-310/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# 
Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9a8d281695d0516a2467de6ac45196afa9fc275f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:0d448ea72515e22e9bf2c0bfe8191888aef809ee03348b975577c9b9a2cdb76d +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..986ac8c13fecd5d2a54fda777dc3246619f8bf77 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd7f9e0f4cd847ed7da2dc0d7d8c91316a061849677129eb2b2058c4b2c4224a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f5fbaf3739704eea759ab29b4b9eba0fecf79ee6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4f581763059f9808c6971d543bee5e034fff1a9ec174cb7aa232dd9f17099da0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0f02c233f432413573681087f8ebce358efeb676 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ae2093149925b534f5c60211635bf0097e5b3bf50dc856b0e3f5b17717e52497 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..1c4cf0883b99550fb235c21ca13d5fc1bb0490bc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/trainer_state.json @@ -0,0 +1,513 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.266666666666667, + "eval_steps": 10, + "global_step": 320, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.24354303557632e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-320/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources 
[optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..38ea5b7c5e44c4eb21ca512c380815f308573119 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ce304d4bfe848d37e1930ec6635ec7bc6b7f31b4972cd9109cb08c3f0329264b +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..737fe5352269aef0a27ea9b4f16829a73ed44bae --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:70d56283a450df25faeb05453c2e059c207782bb9f73d8bf0684d4472ad28cb4 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..759bff60bd0897427bf9d4410df520d35fd20081 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:389caf1bb32aae3a751e11d63ffe273f089df59490c4ac6e5883d944b329df0b +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..2df061fe83adad240544d1899eb2e5e2fb23a555 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:22b896fe763bef96dfe0d570de4fea5d935b3bf80de3a9b1b2918efca334b093 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..75f44c0f1b38d8e68a4cc5640b63aedffdad7fb8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/trainer_state.json @@ -0,0 +1,528 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.4, + "eval_steps": 10, + "global_step": 330, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.40740375543808e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-330/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 
diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..31e650b2fbd7757900c4d3f1160e998b6a0d1810 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:0dd92049ada815ba7e39c8bae254adbf968988e4c9584783ffdf073813309fc9 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..92f9586e01b1bd3acc6719350f527c16a781436a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e05a25c6c046049d56ce670ca1927a66c5901943fcf077e6fbf7e1cefedbc908 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d7fc830aabf2c4827b0609ed6e355d0fa80523b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b904f845552beb994fcd34362e728f918c7473ac27288d463195b51c3ed73bff +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..28521d181a67af05811165bf7cec3a0fcb49ae9d --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d03bb25f48f188323d4c5dda872d760e309dedbed641397ec2ec756835c29ac5 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..eb43df40216f4e20d07821c33550e627ab7df30f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/trainer_state.json @@ -0,0 +1,543 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.533333333333333, + "eval_steps": 10, + "global_step": 340, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + 
{ + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.57126447529984e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-340/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a511bea30416c9a2404e1537c921ba4358d38a74 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:20afcc2f1fe87d097f5f88d5c04526515bc60228c16cddfe42c14d3d9d3c6173 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b3138ae67a1e4a804faae7d8da40e257a66e3134 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:75b2e65a9d1a1253ec7db9e6af17bdb6eee76e73bff0dafc7d8ebcd26eaafffc +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fc3bb37d365dcd8ae3528d8e7242f7d2eae755b3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39cd0c0a4049d541d90e7c6154cb21167a341830884ad3558195617942678446 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3fc1a8fc07398191149b701be855b2b30b04d498 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8b8cfefd46d2412b7b17da7d799f9e9021312d0b294976f3e87f7063aa01557b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5a4f7b72fdf96a5c231df998263d19dcd30d69a1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/trainer_state.json @@ -0,0 +1,558 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.666666666666667, + "eval_steps": 10, + "global_step": 350, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + 
{ + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.7351251951616e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-350/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/README.md 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..30100590266420056b52f4293bf39679a824b5d0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:a87ca1a81bff03651fd743e440d8bd1533759bb1c5115d89626c79dd28cecf8a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d74a4bb020d87ad7270d594c88ca632192b47af5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b0358ce216d66dad5e1abae72289b70f1fc351c1687ef3cca3cdcf935a12d919 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..dff7e422d3f8fc71ea77fa33b28878ffbe8abd43 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d73d43b628bfbe3f56e29099c04e9e9584349f935d8148aa8c34849bf03ef49 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..fc7191e0d24d86be98ffef99b67fe56b52160821 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6cb47a43082c3958508d73a1bd58f111764a18725005ed6a37a8d99585cef386 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..6a632b3a8d44e68656a94acc8990dcff98ed53b5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/trainer_state.json @@ -0,0 +1,573 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.8, + "eval_steps": 10, + "global_step": 360, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.89898591502336e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-360/training_args.bin @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..28679129dbcd524c03ddf9db2696a5a9d6d99fb1 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:be9b06c83ad0e581a42318bdc3e735b6ff0da447f84285bb2a3985149c554e92 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..319e60e015f455c4de87c5bfa84b5e927b243d84 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1134d7a8e2e197f44f9ac7cf1913b3dafb421a02399322ff3c651009d25caf01 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..792417d4c800bc4c8f7eb21d5421678309a6165b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7c0e313f3d6f9e1adc7603b9ffa6f0ab3438f71ce0c71bd9a788485d02b981c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8bd8e28f2af2a751646ea36889854c5eda0b2292 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1511804f46c0ca65fb38b3cc2eecf2ff9872408b4f80615834923e731745685a +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..df9ec8897c01c1f61a8bca900bba3894d1c38023 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/trainer_state.json @@ -0,0 +1,588 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 4.933333333333334, + "eval_steps": 10, + "global_step": 370, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + 
{ + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.06284663488512e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-370/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + 
+[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2626625b6144847e8b4b6c4c33c464a58ac7f6c5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c0d995fbc61a968e2ad62607a7a1b8780380ab8bff4cca049e19229debf0311 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..89f5ca843d1581b51d70dd89387958f9a0d6bb34 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c2c57ec09f55ad19569defd0fe4f76da9e52b5db6cddcd67d5e5d3d97ba3be8c +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..f3b952e81c9ed8c37528c0b9d4c13811ac0b62d3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5ce5744fa32738c65fe7785ec589c49d96370233c9386567c3f06dceedb5f2c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d9c94a2d554cb9176e1f6452c1c8064e701f6c9c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:69cb8b9fe313cb48c89565a287ca91c45004877815ee7660be6b701d2464119a +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f17b7b443b95c24fd24711c5ef2306f9dbdbef06 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/trainer_state.json @@ -0,0 +1,603 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.066666666666666, + "eval_steps": 10, + "global_step": 380, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.22670735474688e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-380/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/README.md new file mode 100644 index 
0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..295d9f0b43caa9ff2ab746b717b781a52e57fca6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:e5a4f0519038c94bfafe532e7b6f5468fad441c2e4f30c9f2599bb9dacec3656 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..53de04a104edd46ce3e1c89462b83a63791029d8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e61162f6418c60af95bdd9eda3a18a76a46c708b669c5b8c97db82b5125cfe72 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b458d8885e612e71d79c420d6ca3a40dcdcf7fd8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f47a6a8940dea009f3b7ce239248233dd458275df17acc4fa8ff99eb346e8979 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..bf1961280088992857ddb8fe8d4584423c44edf6 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6bf472a6dc646995e9eb3a1b728ed47b4f764790f096bc535722b440312b4b49 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8f538c9425f777228160c68aad1019ecc03ad7b8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/trainer_state.json @@ -0,0 +1,618 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.2, + "eval_steps": 10, + "global_step": 390, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + 
"num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.39056807460864e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-390/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** 
[More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, 
+ "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5ee4f6d89dbf4c2fdccbaa439b59e012fe31fbb8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bebb4914fbdcffdae6f1495026dd26d4fba685df5034b525a1fd796a9a2d1d9e +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b393d7e88495a8609d2c195a5ab029a8e0ce41c4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6532d6308964c6bfbea4196a8e4be093e2941f0b6d92c0610a1da6a0b27a9349 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/rng_state.pth new file mode 100644 
index 0000000000000000000000000000000000000000..cc0cb9030af17e56f3ab00fc0ad6850b4636069d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b5fde33a4ff115b0a519c0ef179183e0540c837c91cce3dba97312fa8e725570 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1159228ea69439db76026731513cf5c71e57f3eb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f953d62fd365ebab5cb8aad6e7c0cdb075e95f55a4cb36b4f4e0198710f2320 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..916b1576e16074e102b264b09ff034d34fa2a27a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/trainer_state.json @@ -0,0 +1,93 @@ +{ + "best_metric": 2.0181334018707275, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40", + "epoch": 0.5333333333333333, + "eval_steps": 10, + "global_step": 40, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6554428794470400.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-40/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### 
Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a06dddad9b8a0704d3e18dff50d39c033db50aeb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ebcb1a82112d782ca4debe0266d27b54258785c0f7bcef7d188de3f731506884 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..dbf954e5210f0da7e7b7cd4e51309fc3d8a41cbf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e0bd234db5ae33d0c149a508296a3c1ce34f29248a81383289e9dae21f309f2 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..d06e3c475517e0d14c13a6ccad84a3f20110949a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96f529f9856ab8a411ac6b8078e33cfc18c0159c4947cd8cac8e1238fc1754c7 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f91719d8a1b8836b7155587d155c2b2cfc9c7e48 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bbe59d4638e3afc1c337d3e4814ea99d33c22eec7bbc39984af69898855ffb2b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e775e5cc36bb06be7a63c5ff43357af357212951 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/trainer_state.json @@ -0,0 +1,633 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.333333333333333, + "eval_steps": 10, + "global_step": 400, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.5544287944704e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-400/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. 
More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c6d9b0da1665cbb3f6ba223d8d689ef5d290a20f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5ce8cea7a5e067567855095019f3420af04fb3beb5541563398a1a6657d59b6e +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f052f19a2f44cb7ce9de4641c6b60e43a726bdfd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8522e8e1b957a0a86ed950e4f2ab7bf04307b664a00604b03b3ab3fcfe5a7ebe +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..090a1de878697aa3e6255ed23ff26ce6e561a9fa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2cab01f3c0a9d66cf16eec91d8aebbfd533628e45bdb849b4c3e4ad317f15270 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..390146116c48e62f4426eeb3a1cf7a2ccb90f69b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72422499e547842d9c164e7afacfea53fe3941a7a106527c3755c473fa91c799 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f25ec564d6f8ea25f7cafbb4fe6148dbffa66324 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/trainer_state.json @@ -0,0 +1,648 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.466666666666667, + "eval_steps": 10, + "global_step": 410, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.71828951433216e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-410/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### 
Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1e44b5d7efd5d52621a9c65a69fb0484d6607367 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:80f7eb3bcdd7b1508065f6310168caadd74d6579a9ba12f06f43263359a11747 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..99500a277b7372c565048d21aa0b0b5611d0f455 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f418d1bb08bebb0edf37c68917064191d8e91219f74ff41fe0cf6e65e287e3d8 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..7c168ba589ab149907f65c12980a55da76890995 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02f02c3c7264962c7bbb05c73c2c2f9530a34cf2c29d550cdc787ae19eb6d9bb +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6b66301b7ac8ccf1308c1ac8d63d7000259489d4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5336fb81030d9ecbffa34471d17a4c3981e781c865d7ff7a9b59e360e4230577 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..de6455cc33d0d1f100ac38b5753ad678ab9b2f14 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/trainer_state.json @@ -0,0 +1,663 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.6, + "eval_steps": 10, + "global_step": 420, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": 
false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.88215023419392e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-420/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More 
Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d1ecf6cd6da0bf47d9319ae34c32fbd803e30df5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63b2b39a222388db4eec6c17ebcfbfbcf33672e1ab112c94d754d111b2134ab9 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b8e2d3cacd940e2ce1478ba6a2ab48c5daf7fea9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:38654e37bb27b54cec74afa47f9e3478555ca678b5f34c50b109cfcbe77d9b57 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..eb08c850753d158caff59458c0a4d2fa22ad5de8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f5c1faf0e9eb010c64f51b35236463635709da903fff7194839666558e862b6 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e44e26be68a19106ac45dab84a43a732acb91528 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:579c34af7d7ec0609fbd3479f4f8d8571c4cef90c76d9f6bacf43740f58855d8 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..389a0f18946e7017fc8ff5a3c2370caaf3c68965 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/trainer_state.json @@ -0,0 +1,678 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.733333333333333, + "eval_steps": 10, + "global_step": 430, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.04601095405568e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-430/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- 
**Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. 
(2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + 
"modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..31e3e9c2e459d47c00b66a33951bc6a9438198fe --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:50bdf3f107c52a8525a59197b0d678b15d54a0e81a0580158955e98242788ca7 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..05af3c5ac6b69b40f73705b9b137c3623dae6f52 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ac83973049358d9e4948ebfb1c313c3f89b39d44bb47ffcc63a210f9f83ec49b +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/rng_state.pth 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..5fdc5e50e381540856fecccc6c375074d1aa7b0a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54abee51bb88479cda4bf77e85c2a545e7fb3c5e42f56d1baa63f1344dcc0529 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f8f2b85f23363ba098112683059a3e46233b6bfc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01a563f529b13f402d286b14bda74d3530e1fcecb2bee786164bfa1339da3729 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c459b783bfe5d3acb0f26d51ce3211d73959cace --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/trainer_state.json @@ -0,0 +1,693 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": 
"./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 5.866666666666667, + "eval_steps": 10, + "global_step": 440, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 
0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 
6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + 
"step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 
2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + 
"eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + 
"eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + 
"epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.20987167391744e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-440/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/README.md 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e918f70a62b2f3c94da9717da05023e931cb0d08 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:c1858e9c41b74dd266897b80e11c9ac281d2900e60be6ddb9cf5f81fe9f6dc9a +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c3eb96a2e0a246e70008fc7e5ac4851e802469c0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ea3e6680de4296438b9ff61f990f2fae57a04035ced4be84712e95e476014615 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3e7c44b011328e871a23ca1fea7cc6ea78d70a29 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4cc0a8131f9f14b855b33975c5e795a94be3a332a0f3cf68a9ec3ab6ce73b177 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..671e99d731836dff5ed479ba9e24ab368c795616 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bb8360cb66be4e8be27b2f376c800950e3f00449fb6491d6247165f9aff23820 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..70e6fe15c0e14e1abc91f204ac28d4e1083990d7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/trainer_state.json @@ -0,0 +1,708 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.0, + "eval_steps": 10, + "global_step": 450, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 
3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + 
"stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.3737323937792e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-450/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) 
(NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..f41f2597c71d4af7202c47bc54a223e7e0b2049c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28453c9bf46c5e9f87976ead5ef58c34803edcad8d03dda83f185d4473d5bafd +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7ef2f6b4c2f411c34102dae9db06c73e4823666c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:86efe63f4c771402bea1a4348834e1abf29fb6a77f38a833757fd30f64bff141 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..82f7415495fcd1c3ffb5dae79c8c3a4c2269faa6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a6424cc1a4d391795fbea6a94823363dca21ce0e7ec6c433e8cb5b0aca0060f +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..82f0764fa1ca7bd5d0d2c27e699e54f97149a9da --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:09ce894ec673ae7c851228a15e2e8a3dfc488203c01cbf434a7c4cbec9b7becb +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c672966c87c86964d5dc397a10384850fd8b2b32 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/trainer_state.json @@ -0,0 +1,723 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.133333333333334, + "eval_steps": 10, + "global_step": 460, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.53759311364096e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-460/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. 
More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fd7b8948d80d6af12a4d4fdf9b770532838fbaa8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d17d53139edacc90398a24a4c7f4e360abb31ab75291b35a8f84290f93db308 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..908cfdc8c366ff1f3b064f413e76775f63cc0f4d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e6a40c981bd5b3afc32d9f415496c6e2c13a1366834f3c2a773a4dd8e7a206a3 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..84ca1f63cf231e2aa1c43b465c46ef11c80bc867 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03fc4a1860f68759a4d7833f4317681e377d4e71cf91ab1f091da8cd71579d26 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9e4c1530ac9944d4b54caf372d4f9930c6597321 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee66a0b6b4d05213664fc79a1ffd83a3bbefdb7154906787c3ef06bfdc4539f5 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0e0566bfd804b61b5982463c22d5d738ee8530b6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/trainer_state.json @@ -0,0 +1,738 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.266666666666667, + "eval_steps": 10, + "global_step": 470, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.70145383350272e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-470/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### 
Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..111e84cbad21f6461b017ea49d90d29a6654e683 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d13bbf9f8b5bbe0f24cb6e1a3447a3cac3ebe2a8063dbc229f2a5325f39a5e1c +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..94d84ae53b411e41fea6f110c412c5343584e0e8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5a33d0f8de1cd1c18c873ba80822e8cdbc543d2cc5f88c660568e73e24642397 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..302025be6f88ae472170fe5d230ba39d4ec976df --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:918d6ec8ede8d7a880512e2fc44b16d7c22df85e8b411a004d142edcf446c40d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9c7a583c3e236b2f110dd12004cef1d9a2b13311 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:535824b66976a8cd20163034000bf2ae1a203551ed6ea6132858b6421f4024c0 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a3816498fa978ddd155bd2b5b968a32a5350050b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/trainer_state.json @@ -0,0 +1,753 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.4, + "eval_steps": 10, + "global_step": 480, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.86531455336448e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-480/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources 
[optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b8185d20f70a7dd687e5bb7c67647e4c6b2b5f31 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bf839c7c1b0d6e84c55691ecc6821e19ee062289ca48b325e6b1f0f0ddb7f297 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ed6f47282118c055284f52b1cbadcaa149f6657a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1650416fca39524c3e3f3559c201cb7a9c636998c28ff986a906af95256599a9 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..031b265de35950a615eacc2c86e46292f552e541 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b56a3ff26dded8216d560cf73ba4817b5973851b78edbbf6aa9d6b515761df8c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5c509b1230d4d9d9bf05bb1cf38bcd2d3119d2c8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:65f3e63eff29379b2f31d4f746c0c715c2b686bd11d7e07aba3d5f29231a18da +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c75b5359ad1e06d7e2a35f152767eeb85c5b7c10 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/trainer_state.json @@ -0,0 +1,768 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.533333333333333, + "eval_steps": 10, + "global_step": 490, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.02917527322624e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-490/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] 
+- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, 
+ "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cc342f6286b0839592ab9f1bdcc1a5e883245932 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:85b21e7de225755a006d0501da3d295381fa2805c9a1cd72a57faef7048ace92 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..cd483d5e433f61f3a63b30583cd23b50b66f7ca4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0e0bea2c601baaa4f5e4059475f0096be5e36452c424eed199b8813fed08c733 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/rng_state.pth new file mode 100644 
index 0000000000000000000000000000000000000000..c1fc54eb4786e9f15244e8e4274b14688b87da5d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7062fa0264c6fb17100531852b46c235ce631a6626d5e19749a65ba8723532c0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cee24f7781db565e483521e84ddc6dd277a07ef3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f79415c3ece613ed89d676bff22f42086790a2bced0de6758824fb8c7e27fcc +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..dcd51c0455f676b4dcb82fe495afaaf597a824ea --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/trainer_state.json @@ -0,0 +1,108 @@ +{ + "best_metric": 2.0160505771636963, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50", + "epoch": 0.6666666666666666, + "eval_steps": 10, + "global_step": 50, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + 
"should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8193035993088000.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-50/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + 
+### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..746141794218a74c4071bc2a6af7bffa4038c69c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:05b1d8074c78a80cee3d0cfc2cb8ea1d9ba94b1f9eb593b508cdacdc73673bff +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f396b157e3d53f642d85fd6cd52328ecac1ea4ff --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cf3dcc1f37f4c7199d7f01b592d6cfeb6c672a6804a3dd61f773ab5194b25944 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..96edd96602542afab3935d537c8d1428ce43196b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:beda198a64f1e6f1db0895ff6a6859c2af4c98fbf9c15d1daa4dcca9c20f50be +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..36002f421a8027f0e22e1cea8d6c317eebfd0e2d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e63d56828d52c149ac34c43bdc2adc48c363068c94b9a3df26528670b68d615b +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b77da695c02ca9504820b3c9335197df2f385d5b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/trainer_state.json @@ -0,0 +1,783 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.666666666666667, + "eval_steps": 10, + "global_step": 500, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.193035993088e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-500/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c005c30191d290b0c65bf8665ad186756a87096a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:6d3e21f663171eb0fb60d20336b97fd574d940f33c0e4b5fe90fdca64b51f57b +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bd68581e7c01ecc41b6574699e9e80b9194191a0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96b3221ad7756f0405e611a66bd601536a778564ad97ca6f5f1ce9a7b5d7cd9a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..52b85f2bd42c764f793cd9aa8382577ad1b51617 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:156b16fe2af6b1592b431fe36919ba4914ab9e672f318f884f5045be66654277 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..a5298f7a45852e72ab3264eef95969ac26ee5012 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14a2456b0fb437e597f1bc67f02d12ea64caadba3ce80e5a7bba56290d13a10e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ccc80422b312d7425fdef355f92ca96f90d9b995 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/trainer_state.json @@ -0,0 +1,798 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.8, + "eval_steps": 10, + "global_step": 510, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 
3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + 
}, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + 
"should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.35689671294976e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-510/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More 
Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8a36c732b26c74d535ada1f6eb0dd325ac3d7715 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bd121c3a97fa98e5290106291bb50f94fb7579df86320966a8ad6a87aa3f3485 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..db583ad423d2e773410e977a4954560843253e82 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0c49801ec9ae5e260584f70a6ab2135ee8be4e55823c2fccbbd2d46904674e9a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..736afdcce42e3e1d5dec3aedeed239bc0b63975c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca29f15bc2264125f00923607dbea007ec921af3e528271a2bb77db5cd4d2b66 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d470aca3bc75a59cd83f65a7641e2227523184b0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54453f7799a2c12a65729e49535ef0d1133252bbba34418ca96403f477d1ed92 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bd10c7e8d1a87dc15b5ecc39ef9867a56734d37e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/trainer_state.json @@ -0,0 +1,813 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 6.933333333333334, + "eval_steps": 10, + "global_step": 520, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.52075743281152e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-520/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. 
More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1470c34098b3df0c6e29c189bb4a711ae62291ce --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2b36fcf6626708b59652405b7cc7d8361e4096ac1b21af908e3c93c4e1d19ff3 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..81852b69ad09cfa6551e81bf10a375487485626c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1f6ddecb6e983dd816e3f0db1b3fe7ec91f859b4f9aa01bf7513df353b02aa61 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..b0413aa128dc89fb63c7a74242ac1a6da3ecf5bf --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9436217a6dd3838565d7b9845d97ff2e933eb514cc6ac99465ebc3448de3312 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3038016ab1789281fcb7570057f9ac7ff03feda9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67026c5b7b6af0a730215316d61a8dcdd8b26b784be7a50e23105aea365fc01d +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..87e4d37e105a43e4c14b4a40b6f99767dd55d22f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/trainer_state.json @@ -0,0 +1,828 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.066666666666666, + "eval_steps": 10, + "global_step": 530, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.68461815267328e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/training_args.bin 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-530/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + 
+[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..855ad1145acc49eda7b488c91714302c69a35d74 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14dcca323da52e00cd24a0098e69f095e378fef0acd3465894f70f7a5d4fec80 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ebca16c0b87e94ef9ef215bcd9eab496c265e494 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8637167e90e06895b8900e89ca2dd9e74423c48f194913a0fa574fbbfac55cd6 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..8d48caf21e655a01d7675a2b465c934cea676943 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:816bfad4f86e01da7fe3bd5bf7d10c902cf135a5b5fec9e0170158290fe5828c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4191ea1c11397f76dbbb9677283fd3b541b6e689 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:61e7bf31ab25b6a7b2f0902a2e1f6ca5545ad296580f627246378508da64fa41 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b9766a39e3378fdc802f793758754be3af1b7473 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/trainer_state.json @@ -0,0 +1,843 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.2, + "eval_steps": 10, + "global_step": 540, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + 
"attributes": {} + } + }, + "total_flos": 8.84847887253504e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-540/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** 
[More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fed0a66a32ee83191629be2f5dd8a13c9f8d4fd2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca02e5966883ddefe092b35e5cfe0e1f06719fa1311d7de4318c993c28e3db7c +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..3be078807cd5592e01f14fd7bdb6dfd2deedf172 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e99c05dcb97171a0ae17a2de00c86f5fa958481fe10380e8cea7952a5694ab2a +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..9dc1ec111f2a6f7fbe8d878013e83df65b5f618a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5b6faa8c50c89ce52c86274c8c795afb3f00524e7aef4544572df4b5b6b12c6d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f59d0454a2e540196c447dc81e215fde49e60f8d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe5b11bd9034273a78668f95788292b87ad00f4f53e9e4864d3471380b5838b8 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c18969845a8920ffe07ed91dfcbce47404adc141 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/trainer_state.json @@ -0,0 +1,858 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.333333333333333, + "eval_steps": 10, + "global_step": 550, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + } + ], 
+ "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.0123395923968e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-550/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information 
Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bae142dd59df159240e1f052f4494189159d54e5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f152542b2cd9aaaabb232dd7b2092f77c530ec758db2c9a717f148ed09301ca +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..932df9cb1db8350f7dee2e7e4380a31d279396bb --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03296d5ed339a6b57a0d6418783d73cc23d5f2e039e960d978dfce1ef3158eef +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..75311ff97c8628cb71fe6f6cdca5e9e1127d30b6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b6745ab2a92f54dcacb73c3ceec9d54235e5b225134fb7703879ee6185ad897 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ffa4f4faa1638037c7009a7874a8ec2f958a56f3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01b0d6dead233f71cda974ae02165d32469a3692fb9b97739fca51d1798a012e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3e179ed8efbd5adc255ba3bdb46b33d20fd36e36 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/trainer_state.json @@ -0,0 +1,873 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.466666666666667, + "eval_steps": 10, + "global_step": 560, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.17620031225856e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-560/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0127e4030b7a727073f019808df45ff06371059a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:e1e86d746663769cd19f42a06beea74eddaeb9f649ecfccc83d6924df2754137 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a6b9aa9a6225bb04d7ffc9b4c1370c6b4e1a961d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:660b47c81798d684ee7a5601f139f5ee1c9f7fb9ec3237ee216046793dc09606 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3ed38f9a78b3dbf6f2e73e5bd68681ac198b1983 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d966d92a47b281ed57ee7f44ee2eaa60a54786f7ca9b7e8829ab8723bc8a5a1d +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0f6ab5a4a6c1c8537d29396d68ff9a943067c8eb --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6a34ac3e6b737b225204c7a1c95f58427255f84b0986866cdac344b9d5ba4319 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8359aa6e3fc27a22c638c97275ceebefed4f689a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/trainer_state.json @@ -0,0 +1,888 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.6, + "eval_steps": 10, + "global_step": 570, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 
3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + 
}, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 
2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + 
"should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.34006103212032e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-570/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** 
[More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6b27163f5cb164762c746875e35bc808131df49d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5022e7e41f9a017601fdb9b2909bfe90ec2bccd76d141940bc363a68d4bd865 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d43590e2ecd82a83e5e8a43867d3fe03f348bd1d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:71d108fc0a3cd233d831bb875bf1b58a392dd4120e9b4f523bc7e74483839dd2 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..6f12baaba3ec135e726e0b75dc20ee8cfe8a995d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:55a6ddc6425602c9554969e2910a1ee66847f95ab8fd86352843e16c6530b2c0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..df177b9452bbc35cb78b91089f310520fe740b94 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3aa3352ae201120fa831c764f5b07fe3f9aa427e68763e4c88ed9af407727f22 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..906ae8fe7cf1b9ef4bdb9fbaa0e878046e733261 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/trainer_state.json @@ -0,0 +1,903 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.733333333333333, + "eval_steps": 10, + "global_step": 580, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.50392175198208e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-580/training_args.bin 
@@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cea00bf127f6443160c8dbd31c5db2a01c831c69 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:f530b3c849f1b2f99ac8a31e8db173fa0e972bfa516a278777ee726d746dd09f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..3381ede289a1f777b0eef31ab54a9c914e4fd901 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5a3a4642154c7a58b1cc0d659e8318f304b274e7f01d7620c5eab93155017576 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f2cbe02e4922a4920c0a827f09f6df580967beb0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5704b322a17ce5b2788c1247543e3ca9edc36d083fd8ecc8ca80d04334c6030 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e93107ab0b0cdc649d183c879754ed083006f9d7 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c640cac3d338c5c53c53ad351f9ec822b97e3962fe58e3c4439d6cecd03512ac +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..7dfadf52e9c43254210f2223b335cadddc2ef597 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/trainer_state.json @@ -0,0 +1,918 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 7.866666666666667, + "eval_steps": 10, + "global_step": 590, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + 
{ + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 
2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 
2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + 
}, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 
2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.66778247184384e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-590/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/README.md new file mode 100644 index 
0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e0645cfba89c7a2585b9840fe1189f702eeca84a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:a1d9c4ea3c87179740f6ce0e1af5b21da76a032d8f244038ff6641342f576188 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..42111b7a44155de95df7561c2d83523e76b38707 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67e32199ee8fbf1fdf5406260895df1d9bf6b7b6b83d4ddfdb821f471f970190 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3d041c10a3af80c2be01488b87e7c23a107acab4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:224b98cd2a3813f8f156af229101dde99ced2e24294f3d7ad7b1538fdc49c27c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e35866f32db88c57fbcc281885df929786abae39 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db64dfcaaa6d2770fdeb8c6c250f6efda7e6b2cbc236d50bf153703fcb63ac50 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..131ed16059ed6b4166401c6e7d262c08a4a9d1d8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/trainer_state.json @@ -0,0 +1,123 @@ +{ + "best_metric": 2.015014171600342, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60", + "epoch": 0.8, + "eval_steps": 10, + "global_step": 60, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9831643191705600.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/training_args.bin new file mode 100644 index 
0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-60/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made 
aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8144a76ffc8e405892beaf231a4d3f9e8493ea00 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6fa84987099a84d9cb5052c850634d96879515f74c4bba6bcd9431cc75e60e01 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c0bc46d78964250e8ba01d329beddeb459cc5f4d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6641833b23f79d6d27f92418d72b6c5efc927ca79c854dc041ee6bad45fb0ea6 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..ef40b259bc3233779099c3b8651c2fe0a9d07fa5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9bbc772ea5a37ab482a5fa0d13a2014584215ee3da6246ff6fe50fb8dafbfb8e +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b706276d2ff18ccd83310c61d87eb2ed9fc15f80 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:efa99c3d03a71b7b58bf8c6b52c8cd63b4d6a19d88cbdc8dfd20580671d183cb +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..9427103642fae2232c91f055a7ee01e0c28ce4c5 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/trainer_state.json @@ -0,0 +1,933 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.0, + "eval_steps": 10, + "global_step": 600, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.8316431917056e+16, + 
"train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-600/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More 
Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..52b022a4156e99b5dae833ba18f9deab8ee0c3d3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5e158f1c72f415ec105bf56afce72514a8315953336d332da61b605690b710f +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bb8437a3a604f2c0ad1dbf6548df595615c09587 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b8337ac5202080443098fa45868c7b19fd6fe3b610745c224588d0787fd4d1cb +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..6a970899a5edc16268fdea83560e0495a3d06810 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa5b53289977451ca52671d3897055616936322daf22f6e4246ff72a467aef1c +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9ae36545a6cbe48ac387e9a4edd1288050b062ff --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b0de73605756aa391aaa9ea36adcbd12bd865860a2561b0aaca0c704b25cfe02 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a7729748be2490119bef67b4c1067a9b631d530f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/trainer_state.json @@ -0,0 +1,948 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.133333333333333, + "eval_steps": 10, + "global_step": 610, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + } + ], + "logging_steps": 10, + "max_steps": 675, + 
"num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.99550391156736e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-610/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More 
Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c305cbdc44dff9b0af414d87a08fd2491cafa9f9 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:16a5f8430224899cb5a955b74fecbd3e14f3c7a857a58f21f7d7d3bc434a6cec +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ffcb8cf0e4d3fc2eb9800bb86458068d00b3d9bc --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b79eff46217dd5a339d943c048a6ffb3977d0dc1430a62c8bc06bd6fb366f020 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..da7e5f0f7045f8fad1c1529974e555cc67b8f5f0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f2b2ce429e00eba0165cdfd527b7ca384fed68ae5660561d0cbc6dbdd51ce7f1 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8f09b0521c27c995f0878cde37cf7b4138abd8e6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0cecac504a0d6e20c848bc43265028cb51bdbaee46716ad0736302cdd3a2376c +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..aabf4b8f374777908e166fbd04aaaedcacc550f4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/trainer_state.json @@ -0,0 +1,963 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.266666666666667, + "eval_steps": 10, + "global_step": 620, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 
3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.015936463142912e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-620/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e9134306f1e65bddb5d9f0a40037479fe234aae3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:15494e8198fa84f42e36b15aca0f1a38e4cc28e6cc24b62cef8406cc345d8731 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..54f296e92ab9fb14a83cb905aa7fbcf345ce7003 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e69d42bbcc584d0e1100cf0e75a9d5a1f0ca837e47abd2ceb9af56e58a10e5f8 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..96d7a3f6be074e46014211fae837a521e5c5140c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd6c4f62bed5401eddcf930d960632a48c624bea715ca64cedd7d04db198b4a0 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..31b434c7d46bacc0a45ac73d9e6264e373e131cd --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d8d4a20c36091528ac87a7edc5845454d614d78ad71a59c7a4ae563b2fe291f +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b2de6ab19152b135dac76ef52ca465e46602cec4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/trainer_state.json @@ -0,0 +1,978 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.4, + "eval_steps": 10, + "global_step": 630, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 
3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + 
}, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 
2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + 
"eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.183739423751831, + "learning_rate": 5.333333333333334e-06, + "loss": 1.2938, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.4073400497436523, + "eval_runtime": 43.8406, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 630 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + 
"should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.032322535129088e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-630/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources 
[optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..3c8723234511dbddb7a7a22d4646b83df0da110e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5ff8305382e71492421692a59a46df72af4fefa90261bbfe9f7a9ea3353895af +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..cd9bd0c2fb31d3188c9778c8a4502bf9c2a87604 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bcfd6941ef88162426e6d3afb9acc723873e3d6b42bde36e295bd7fe769d5c36 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..bc02fa7e506af341c87e94bd62a6cbdfbd057096 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0597f3b9ac321e002676eb1712670348770197d9b197cdd7a7e16f465315444e +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cc14f7545324288e67e156d036369e2cebdcf74f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd2d3d121570090627f59257118b55358f83f1b060f0fb11ab062387addadff4 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bbd82ddf79b61945a9093c511f4e3fffaa980e75 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/trainer_state.json @@ -0,0 +1,993 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.533333333333333, + "eval_steps": 10, + "global_step": 640, + "is_hyper_param_search": 
false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 
3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.183739423751831, + "learning_rate": 5.333333333333334e-06, + "loss": 1.2938, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.4073400497436523, + "eval_runtime": 43.8406, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.044917106628418, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.2809, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.4101006984710693, + "eval_runtime": 43.829, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 640 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.048708607115264e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-640/training_args.bin @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..f738dbfb03b701bcdf822fe9261565eee619221b --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:47068f3009f9aad8970bb891222947b79c232d30a1a5abc8d37a06563507c312 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1f08712cd9eae5e11e948007d2bc09b88b2c9613 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:15ec656aa9e2c5ffed6f2fed27dd6a9ce54f8d5bdba3c87c94153afae1df93ba +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d763156eb3a586b51733d4ec683a815a6ae5fab --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e66e316bd2615a5005aac13970f8b8e71830843ea716191e53ff7dc38997af08 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..75f881b3ec9ba86b1878709fb0af361a6f712546 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:31b02e0b7ebaaab7bf8f183e3b47970500df166e496df9fdd39405913db43e64 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..999d377e007b00eed96360a901019dd20ec9df5c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/trainer_state.json @@ -0,0 +1,1008 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.666666666666666, + "eval_steps": 10, + "global_step": 650, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + 
{ + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 
2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 
2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + 
}, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 
2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.183739423751831, + "learning_rate": 5.333333333333334e-06, + "loss": 1.2938, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.4073400497436523, + "eval_runtime": 43.8406, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.044917106628418, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.2809, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.4101006984710693, + "eval_runtime": 43.829, + 
"eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.301975727081299, + "learning_rate": 2.962962962962963e-06, + "loss": 1.2257, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.409301519393921, + "eval_runtime": 43.831, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 650 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.06509467910144e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-650/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- 
/dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..395b1e858876ba4193bc027fcc66ae0c688a9aac --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:7c78e12df07e7cb00a97a923157e07291432c748a30081382c74959d71126f16 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..cae8e1f2761ab8496cfc6cf31984d7711e7ad1a7 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d87baaab4fd95a7b4dc3177271763cf86a087bdb6e884d1e120b4070efefbb72 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1bd0e24dcfea6867dcdb66e0b90f3344dbd9d339 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66fa7ea9452d536e82e5c18c4a0a05615143763aa569d9af13553a06a11128de +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..403f6f78ce81468eb12e4e1c093d8452c7d5a14e --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9e3b3eb476269cb66006445e45fa57a95b9d6fbb9998ae81b82199f9b98541e +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e6272a290b5e4fdbeed6289a42a591f6f33825a6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/trainer_state.json @@ -0,0 +1,1023 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.8, + "eval_steps": 10, + "global_step": 660, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 
7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + 
}, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 
2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + 
"eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, 
+ "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, 
+ "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 
3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + 
}, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 
2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + 
"eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.183739423751831, + "learning_rate": 5.333333333333334e-06, + "loss": 1.2938, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.4073400497436523, + "eval_runtime": 43.8406, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.044917106628418, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.2809, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.4101006984710693, + "eval_runtime": 43.829, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 
2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.301975727081299, + "learning_rate": 2.962962962962963e-06, + "loss": 1.2257, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.409301519393921, + "eval_runtime": 43.831, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.4970486164093018, + "learning_rate": 1.777777777777778e-06, + "loss": 1.2425, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.4106786251068115, + "eval_runtime": 43.8162, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 660 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.081480751087616e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-660/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/README.md 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e3030b734b300ac03c1d9c8d6aa9c7e330620f90 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:c044b34e1aecc827bc360ab0e6ccf4f7d0ad9736646132ae8180525cb03fa0e2 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bb8e587e45391ccf9c84aac494c1bc54819daa3d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:58119c0f61185606a2dad68e1ca616af411d9432aee0d32d9672e8a78837f1b8 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b50ed8357a00070f99a52843c3e3d150dbd5b1aa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5bb0850ed44e50e4ccb2afc9aab9a80c17a31208454b069930105956f7f9a183 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..72ac93ee4249bf1220c3ed82f099c14ae0267a68 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:886b6be563b163a73eaac3a0ce905ce45ea5202bed173e897fec04ed18434edc +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c2142d47b0e3ceca4ddeea141dcb9b36829f9ffa --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 8.933333333333334, + "eval_steps": 10, + "global_step": 670, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, 
+ "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + 
"eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + 
"eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + 
"eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + 
{ + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 
2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + "eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 
2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + 
}, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 
2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.183739423751831, + "learning_rate": 5.333333333333334e-06, + "loss": 1.2938, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.4073400497436523, + "eval_runtime": 43.8406, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.044917106628418, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.2809, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.4101006984710693, + "eval_runtime": 43.829, + 
"eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.301975727081299, + "learning_rate": 2.962962962962963e-06, + "loss": 1.2257, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.409301519393921, + "eval_runtime": 43.831, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.4970486164093018, + "learning_rate": 1.777777777777778e-06, + "loss": 1.2425, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.4106786251068115, + "eval_runtime": 43.8162, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 5.311999797821045, + "learning_rate": 5.925925925925927e-07, + "loss": 1.1569, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.4109017848968506, + "eval_runtime": 43.8901, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.097866823073792e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-670/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. 
More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 
8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..91da455cfeb694cd48ba1aee262d1bf739824c8d --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:61950d13f26ca02c95b5e334e8ba28d16686b7776171428535109cc7c1d85549 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..31d647922aff96165c885803ba1d619cd6c8059e --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a27b7857571e0b74bf4e4b504c42934878e3cdbe76b625790d49fdd88c389902 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/rng_state.pth new file 
mode 100644 index 0000000000000000000000000000000000000000..bb61823d0d78956427b74dd1a3fc741ba1b2381f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c44717b587bf877ea1a37c7f5747a93e45e34ce231c845a31a9b8a042ee22593 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6f069e8ad5743a7071d53989d6edf25a382b7133 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d33603c9602f50d32bd619f686fa4097b405a474d15f526ce09de1176943edee +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..037a8872b99b6b8f6a0f9b92ff71e207a0a7182f --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 9.0, + "eval_steps": 10, + "global_step": 675, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5929412841796875, + "learning_rate": 6.814814814814815e-05, + "loss": 1.8689, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.013707160949707, + "eval_runtime": 43.9223, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.5305164456367493, + "learning_rate": 6.696296296296296e-05, + "loss": 1.9302, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.0158851146698, + "eval_runtime": 43.9185, + "eval_samples_per_second": 22.769, + "eval_steps_per_second": 2.846, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.5468393564224243, + "learning_rate": 6.577777777777777e-05, + "loss": 2.0069, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.0156633853912354, + "eval_runtime": 43.9264, + "eval_samples_per_second": 22.765, + "eval_steps_per_second": 2.846, + "step": 120 + }, + 
{ + "epoch": 1.7333333333333334, + "grad_norm": 0.6919519305229187, + "learning_rate": 6.45925925925926e-05, + "loss": 1.9791, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.0175247192382812, + "eval_runtime": 43.9214, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6861239671707153, + "learning_rate": 6.340740740740741e-05, + "loss": 1.8669, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.016254425048828, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7881955504417419, + "learning_rate": 6.222222222222223e-05, + "loss": 1.9967, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.0169668197631836, + "eval_runtime": 43.8913, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7481264472007751, + "learning_rate": 6.103703703703704e-05, + "loss": 1.7485, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.0348880290985107, + "eval_runtime": 43.8921, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 0.9900014996528625, + "learning_rate": 5.9851851851851855e-05, + "loss": 1.9284, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.0394678115844727, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.0921311378479004, + "learning_rate": 5.8666666666666665e-05, + "loss": 1.7713, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.057318925857544, + "eval_runtime": 43.9122, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 180 + }, + { + "epoch": 2.533333333333333, + 
"grad_norm": 1.0986183881759644, + "learning_rate": 5.748148148148149e-05, + "loss": 1.812, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.056302309036255, + "eval_runtime": 43.9015, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.3805949687957764, + "learning_rate": 5.62962962962963e-05, + "loss": 1.8037, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.0546820163726807, + "eval_runtime": 43.8771, + "eval_samples_per_second": 22.791, + "eval_steps_per_second": 2.849, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.097198486328125, + "learning_rate": 5.511111111111112e-05, + "loss": 1.7936, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.0576555728912354, + "eval_runtime": 43.8703, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.1730059385299683, + "learning_rate": 5.392592592592593e-05, + "loss": 1.8034, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.0635132789611816, + "eval_runtime": 43.89, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.1634464263916016, + "learning_rate": 5.274074074074074e-05, + "loss": 1.7507, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.0704760551452637, + "eval_runtime": 43.8936, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.4796065092086792, + "learning_rate": 5.155555555555556e-05, + "loss": 1.6669, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.111283302307129, + "eval_runtime": 43.8949, + "eval_samples_per_second": 22.782, + "eval_steps_per_second": 2.848, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.552974820137024, + "learning_rate": 
5.037037037037037e-05, + "loss": 1.7464, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.1242177486419678, + "eval_runtime": 43.9316, + "eval_samples_per_second": 22.763, + "eval_steps_per_second": 2.845, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6674655675888062, + "learning_rate": 4.918518518518519e-05, + "loss": 1.6434, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.126720666885376, + "eval_runtime": 43.9011, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.9751604795455933, + "learning_rate": 4.8e-05, + "loss": 1.633, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.129762887954712, + "eval_runtime": 43.8854, + "eval_samples_per_second": 22.787, + "eval_steps_per_second": 2.848, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.6773359775543213, + "learning_rate": 4.681481481481481e-05, + "loss": 1.6549, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.128932237625122, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.5143377780914307, + "learning_rate": 4.5629629629629636e-05, + "loss": 1.6713, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.114884376525879, + "eval_runtime": 43.8858, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.7334610223770142, + "learning_rate": 4.444444444444445e-05, + "loss": 1.6209, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.12459659576416, + "eval_runtime": 43.9019, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 1.8721027374267578, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.4866, + "step": 310 + }, + { + 
"epoch": 4.133333333333334, + "eval_loss": 2.193692922592163, + "eval_runtime": 43.9058, + "eval_samples_per_second": 22.776, + "eval_steps_per_second": 2.847, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.286405563354492, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.4802, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.184469223022461, + "eval_runtime": 43.9398, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 1.911645770072937, + "learning_rate": 4.088888888888889e-05, + "loss": 1.5652, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.199193000793457, + "eval_runtime": 43.9102, + "eval_samples_per_second": 22.774, + "eval_steps_per_second": 2.847, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1289539337158203, + "learning_rate": 3.970370370370371e-05, + "loss": 1.5381, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.1956825256347656, + "eval_runtime": 43.9035, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.496502637863159, + "learning_rate": 3.851851851851852e-05, + "loss": 1.6026, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.210069417953491, + "eval_runtime": 43.8995, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.1697394847869873, + "learning_rate": 3.733333333333334e-05, + "loss": 1.4914, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.1952123641967773, + "eval_runtime": 44.6778, + "eval_samples_per_second": 22.382, + "eval_steps_per_second": 2.798, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.109811544418335, + "learning_rate": 3.614814814814815e-05, + "loss": 1.5546, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 
2.199993133544922, + "eval_runtime": 44.6864, + "eval_samples_per_second": 22.378, + "eval_steps_per_second": 2.797, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.003615617752075, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.4419, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.223910093307495, + "eval_runtime": 44.6823, + "eval_samples_per_second": 22.38, + "eval_steps_per_second": 2.798, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.304835319519043, + "learning_rate": 3.377777777777778e-05, + "loss": 1.4402, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.267995595932007, + "eval_runtime": 44.6577, + "eval_samples_per_second": 22.393, + "eval_steps_per_second": 2.799, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.4007205963134766, + "learning_rate": 3.259259259259259e-05, + "loss": 1.4117, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.267059803009033, + "eval_runtime": 44.5611, + "eval_samples_per_second": 22.441, + "eval_steps_per_second": 2.805, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.3035316467285156, + "learning_rate": 3.140740740740741e-05, + "loss": 1.4646, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.268526315689087, + "eval_runtime": 44.6155, + "eval_samples_per_second": 22.414, + "eval_steps_per_second": 2.802, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.5573678016662598, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.4416, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.272042751312256, + "eval_runtime": 44.4978, + "eval_samples_per_second": 22.473, + "eval_steps_per_second": 2.809, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.602250814437866, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.4707, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.2793080806732178, + "eval_runtime": 44.2471, + 
"eval_samples_per_second": 22.6, + "eval_steps_per_second": 2.825, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.553790807723999, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.4299, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.2651755809783936, + "eval_runtime": 44.3184, + "eval_samples_per_second": 22.564, + "eval_steps_per_second": 2.82, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.5743160247802734, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.3563, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.2669320106506348, + "eval_runtime": 44.4667, + "eval_samples_per_second": 22.489, + "eval_steps_per_second": 2.811, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.75224232673645, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.2773, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.321972131729126, + "eval_runtime": 44.5579, + "eval_samples_per_second": 22.443, + "eval_steps_per_second": 2.805, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 2.882275342941284, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.2917, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.330695390701294, + "eval_runtime": 44.113, + "eval_samples_per_second": 22.669, + "eval_steps_per_second": 2.834, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.4650065898895264, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.3448, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.340852737426758, + "eval_runtime": 43.8091, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 7.873574733734131, + "learning_rate": 2.192592592592593e-05, + "loss": 1.4148, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.3322806358337402, + "eval_runtime": 43.8316, + "eval_samples_per_second": 22.815, + 
"eval_steps_per_second": 2.852, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0111842155456543, + "learning_rate": 2.074074074074074e-05, + "loss": 1.3852, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.3360278606414795, + "eval_runtime": 43.8107, + "eval_samples_per_second": 22.825, + "eval_steps_per_second": 2.853, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 2.736973762512207, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.3255, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.332850933074951, + "eval_runtime": 43.8211, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 2.7843706607818604, + "learning_rate": 1.837037037037037e-05, + "loss": 1.322, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.3239428997039795, + "eval_runtime": 43.797, + "eval_samples_per_second": 22.833, + "eval_steps_per_second": 2.854, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 2.6241109371185303, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.2514, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.346761703491211, + "eval_runtime": 43.8096, + "eval_samples_per_second": 22.826, + "eval_steps_per_second": 2.853, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 2.6017918586730957, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.2596, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.3801326751708984, + "eval_runtime": 43.8136, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.0953481197357178, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.2604, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.373786211013794, + "eval_runtime": 43.8205, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 550 + }, + { 
+ "epoch": 7.466666666666667, + "grad_norm": 3.231630563735962, + "learning_rate": 1.362962962962963e-05, + "loss": 1.2677, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.3813154697418213, + "eval_runtime": 43.8661, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.228578805923462, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.3165, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.375894546508789, + "eval_runtime": 43.8615, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.699553966522217, + "learning_rate": 1.125925925925926e-05, + "loss": 1.3078, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.3867671489715576, + "eval_runtime": 43.8442, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.340555191040039, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.2713, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.3834099769592285, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.3263585567474365, + "learning_rate": 8.888888888888888e-06, + "loss": 1.2524, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.3754892349243164, + "eval_runtime": 43.8588, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.005136251449585, + "learning_rate": 7.703703703703704e-06, + "loss": 1.1771, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.401296854019165, + "eval_runtime": 43.8274, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 
3.3590383529663086, + "learning_rate": 6.51851851851852e-06, + "loss": 1.2372, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.4095165729522705, + "eval_runtime": 43.8221, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.183739423751831, + "learning_rate": 5.333333333333334e-06, + "loss": 1.2938, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.4073400497436523, + "eval_runtime": 43.8406, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.044917106628418, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.2809, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.4101006984710693, + "eval_runtime": 43.829, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.301975727081299, + "learning_rate": 2.962962962962963e-06, + "loss": 1.2257, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.409301519393921, + "eval_runtime": 43.831, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.4970486164093018, + "learning_rate": 1.777777777777778e-06, + "loss": 1.2425, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.4106786251068115, + "eval_runtime": 43.8162, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 5.311999797821045, + "learning_rate": 5.925925925925927e-07, + "loss": 1.1569, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.4109017848968506, + "eval_runtime": 43.8901, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + 
"save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": true + }, + "attributes": {} + } + }, + "total_flos": 1.10605985906688e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-675/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] 
+- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, 
+ "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..231b57053982ac48903bc0fd4e4ea606b14660be --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:57c454ee1947c65e46bbd01b6c883eda6c5e661347acf5fc0474d0c0d955a587 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..4c0828d244e4e54b7a009940c8f58c5cb0bf8978 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e982fb8a427c2ea0ebf9fd6c1bed8287cc375f7db10dc3fce993f2520b0e2acd +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/rng_state.pth new file mode 100644 
index 0000000000000000000000000000000000000000..2b1c959e3b92a9d3847cd61e595c79a1813cfe3a --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bf8faccd3d2ca94b80304c3092e394e13d076f35c0c4f51d74490ac3412d5f9 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..edc613be9a8a7736c1c5e6c411193a18eb94121c --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:116e4caee7c9274e6f2a7d93ee5e67e259426d00592030a182ec1bf7e3e1fd99 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c3ce478fff72f077e2685ad4b2d23db4205e8be8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/trainer_state.json @@ -0,0 +1,138 @@ +{ + "best_metric": 2.0119094848632812, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70", + "epoch": 0.9333333333333333, + "eval_steps": 10, + "global_step": 70, + "is_hyper_param_search": false, + 
"is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + 
"eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.14702503903232e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-70/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. 
+ +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git 
a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e006571f24a9d4a7afff9af7a91ed568a9892de3 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:204894f337f4262e6ea940df81e3c9da95f98b3eae0797156e49cb8e9739cd31 +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..482843e71aecfed54e51d5ab980b2f921311f949 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e27625c7b5beeb0f6d525943caa3160cf01fb2d7937b80fed70ed9d482eec895 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..0b228b8e8106f666fe286c5d131d496d926a7df4 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:debbe8bbbf3d0dfd719072ab48974c332b6f78ebe25ef99f5002c8d0a8c8c380 +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8bf25b5c8780313aa53c49c9a020653afda88fbe --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e54696b8c39c3b120a2b1d4d03623aee6400315f6e759074fafe42342c8bf95 +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..58d86c0cd54c94046e50c0798afb330bf3abdb93 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/trainer_state.json @@ -0,0 +1,153 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.0666666666666667, + "eval_steps": 10, + "global_step": 80, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + 
"learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + 
"save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.31088575889408e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/README.md b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- 
**Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). 
+ +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_config.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..554fd0e5c0c057f67840266f7df54ae73f3880f6 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, 
+ "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense_4h_to_h", + "query_key_value", + "dense", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_model.safetensors b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..dc088f9f7326e22401e0fae9f6bfe19bcdb1cf77 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e4c03826322bd9359baac67b6fd77bc4a48f93b90dc2efaf9643ee4aebf5414d +size 67144544 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/optimizer.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fc16d99b93e97f210ef8a29e819c41aebfc95cd2 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6d5e2a51d4b1b98898669d98d8c93097dd72a841d74048410c246fc610c509a0 +size 134432453 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/rng_state.pth b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/rng_state.pth new file mode 100644 
index 0000000000000000000000000000000000000000..4041231f7cc289aaec627b941b3ce1ed104a3678 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e1884689751e2c9aa53b83d7472089621e5727e27a037b479e2287c7b208b1a +size 14575 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/scheduler.pt b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d1e19095e23644fde7d19dd9320fdb8daf7fd2bd --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28209e35c6873af016e1c69801c50fdb913d066bb8fab0d3da00cafc566c1a5c +size 627 diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/trainer_state.json b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..6324ee86ee08067b0a6e1df4c6d54275239ac0b8 --- /dev/null +++ b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/trainer_state.json @@ -0,0 +1,168 @@ +{ + "best_metric": 2.009955883026123, + "best_model_checkpoint": "./output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-80", + "epoch": 1.2, + "eval_steps": 10, + "global_step": 90, + "is_hyper_param_search": false, + "is_local_process_zero": 
true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.49085623025894165, + "learning_rate": 7.881481481481482e-05, + "loss": 1.9937, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.0317482948303223, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.46359726786613464, + "learning_rate": 7.762962962962963e-05, + "loss": 2.0265, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.0271925926208496, + "eval_runtime": 43.8912, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.6025303602218628, + "learning_rate": 7.644444444444445e-05, + "loss": 1.9621, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.0226855278015137, + "eval_runtime": 43.8927, + "eval_samples_per_second": 22.783, + "eval_steps_per_second": 2.848, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.8317577838897705, + "learning_rate": 7.525925925925926e-05, + "loss": 1.9567, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.0181334018707275, + "eval_runtime": 43.8874, + "eval_samples_per_second": 22.786, + "eval_steps_per_second": 2.848, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.4699082672595978, + "learning_rate": 7.407407407407409e-05, + "loss": 1.9675, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.0160505771636963, + "eval_runtime": 43.9047, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.4183911979198456, + "learning_rate": 7.28888888888889e-05, + "loss": 2.075, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.015014171600342, + "eval_runtime": 43.9142, + "eval_samples_per_second": 22.772, + "eval_steps_per_second": 2.846, + "step": 60 
+ }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.36181703209877014, + "learning_rate": 7.170370370370371e-05, + "loss": 1.9579, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.0119094848632812, + "eval_runtime": 43.9414, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.40919414162635803, + "learning_rate": 7.051851851851853e-05, + "loss": 1.8443, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.009955883026123, + "eval_runtime": 43.9074, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.5300728678703308, + "learning_rate": 6.933333333333334e-05, + "loss": 1.8683, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.0114803314208984, + "eval_runtime": 43.8717, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.47474647875584e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/training_args.bin b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..682bf0e02adfb8a40aeb547745b5660a324bf954 --- /dev/null +++ 
b/output_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05/checkpoint-90/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e50514bd912c79375f7c29e8ddf29184fbfcebcdd686d6d38d515bb59fe86c2 +size 4859 diff --git a/responses_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05-ft.jsonl b/responses_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05-ft.jsonl new file mode 100644 index 0000000000000000000000000000000000000000..db319778b296436a3853197961a6d85d98502dae --- /dev/null +++ b/responses_ft_more_layers_uspto_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-uspto-8e-05-ft.jsonl @@ -0,0 +1,1000 @@ +{"output_text": "ively at the output of the subtractive combining circuit, and the comb filter output is a composite color video signal having a comb-like characteristic.\nA conventional Y-type comb filter is an open-loop circuit formed of a delay line with a one-horizontal-line-period delay time and a subtractive combining circuit for combining the delayed and undelayed luminance component. Because the phase of the luminance subcarrier signal is constant from one line to the next, the delayed and undelayed", "input_text": " causing the computer to store the communication control rule.\nFurther scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given through illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description. 1. 
Field of the Invention\nThis invention relates to video signal processing circuitry for processing a composite color video signal, and is more particularly directed to a comb filter circuit suitable for processing a color video signal reproduced by means of a video tape recorder/player or VTR.\n2. Description of the Prior Art\nComb filters are commonly used for processing the luminance and chrominance components of a composite color video signal reproduced from magnetic tape or other recording medium in order to eliminate undesired signals, such as so-called crosstalk. Because of its serrated characteristic, a comb filter also aids in separation of frequency-interleaved signals, and thus can be used to eliminate remnants of chrominance components from the reproduced luminance component and to eliminate remnants of the luminance component from the reproduced chrominance component.\nGenerally, there are two types of comb filter circuits commonly used in a VTR: a chrominance, or C-type comb filter for processing the reproduced chrominance component; and a luminance, or Y-type comb filter for processing the reproduced luminance component.\nA conventional C-type comb filter is an open-loop circuit formed of a delay line with a one-horizontal-line-period delay time and a subtractive combining circuit for combining the delayed and undelayed chrominance component. Because the phase of the chrominance subcarrier signal alternates from one line to the next, the delayed and undelayed chrominance components combine construct"} +{"output_text": "-layer varistor, the varistor body is formed by a plurality of varistor layers, which are stacked one on top of the other and are separated from one another by a plurality of dielectric layers. The varistor layers are formed by a plurality of varistor layers, which are stacked one on top of the other and are separated from one another by a plurality of dielectric layers. 
The varistor layers are formed by a plurality of varistor layers, which are stacked one on top of the other", "input_text": "D) protection of high-frequency circuits and data lines. When an impermissibly high voltage occurs for a high-frequency circuit or data line to be protected, the spark gap between the two tips of the interconnect lying opposite one another ignites, carrying an arcing discharge, so that this impermissibly high voltage is not adjacent at the high-frequency circuit or data line.\nThe ignition of the spark gap occurs according to specific physical laws in which the gas discharge characteristic is specifically traversed. This event requires a specific time duration so that the time alone that is needed for the ionization of the spark gap is usually longer than the rise time of an ESD pulse, which can lie on the order of magnitude of 700 ps.\nIn summary, this means that, due to their inertia, spark gaps have disadvantages when used as ESD protection of high-frequency circuits or data lines.\nCompared to spark gaps, multi-layer varistors are characterized by a substantially shorter response time: the response time of multi-layer varistors are on the order or magnitude of 500 ps, which is lower by approximately a factor of 2 than the response time of spark gaps. Nonetheless, multi-layer varistors have previously not been utilized as ESD protection of high-frequency circuits or data lines, because of the laminar structure of the multi-layer varistors. This laminar structure, namely, leads to parasitic capacitances that make the use of multi-layer varistors impossible in high-frequency circuits with frequencies above 100 MHz. Such high-frequency circuits are, for example, high-frequency input circuits such as antenna inputs, etc.\nFIGS. 13 through 15 show an existing multi-layer varistor in perspective (see FIG. 13), in section (see FIG. 14) or in an overall view with inner electrodes conducted out (see FIG. 
15).\nGiven this multi"} +{"output_text": " flexible tubing having a closed end and an open end. The open end is sealed circumferentially over a first end of a rigid tubular casing. The casing has a second end closed. A reservoir of liquid is provided that can be raised above and lowered below the level of the casing. The reservoir is connected by a conduit to the casing. When the reservoir is raised, relative to the casing, liquid flows into the casing, causing the tubing to evert out of the end of the casing. When the", "input_text": " reverse of the insertion process.\nU.S. Pat. No. 3,433,214 (Silverman) discloses a method and apparatus for everting a tubular probe into a body cavity. The device consists of a tubular casing, a flexible, thin-walled tubular probe inside of the casing with the inside end closed and the outside end sealed circumferentially over a first end of the casing. The second end of the casing is closed. A reservoir of liquid is provided that can be raised above and lowered below the level of the casing. The reservoir is connected by a conduit to the casing. When the reservoir is raised, relative to the casing, liquid flows into the casing, causing the tubing to evert out of the end of the casing. When the reservoir is lowered, relative to the casing, liquid flows out of the casing, causing the tubing to be inverted back into the casing.\nU.S. Pat. No. 3,502,069 (Silverman) discloses a method and apparatus for placing a tubular probe in a body cavity and for retrieving the probe from the body cavity. The instrument comprises a rigid tubular casing and a flexible eversible tubing inside the casing. One end of the tubing is closed and the other end is sealed circumferentially over an open end of the casing. The other end of the casing is closed. Means are provided to pump fluid into the casing, at a pressure higher than the pressure outside the casing, to evert the tubing out of the open end of the casing. 
Means are provided to pump fluid out of the casing, at a pressure lower than the pressure outside of the casing, to invert the tubing back into the casing.\nU.S. Pat. No. 3,506,011 (Silverman) discloses a medical instrument for everting a thin walled flexible tubing. In particular, there is provided a length of collapsed thin walled"} +{"output_text": " arm 420. The head 402 is attached to a flexure 430 that is attached to the suspension arm 420. The flexure 430 is attached to a load beam 440 that is attached to the suspension arm 420. The load beam 440 is attached to a base plate 450 that is attached to the suspension arm 420. The base plate 450 is attached to a voice coil motor (VCM) 460 that is attached to the suspension arm 420. The VCM 460 is attached to a yoke 470 that is", "input_text": " as a merged head. The write head may be less than 30 microns above the rapidly spinning disk and the transaction is virtually instantaneous. In future, higher density recording media may require a near-zero gap.\nFIG. 3B illustrates one embodiment for a printed circuit board for use in the head-disk assembly. A printed circuit board 400 includes multiple layers including a power plane, ground planes, and signal paths. In general the printed circuit board includes, for operation of the hard disk drive, digital circuits 356, clock 340, analog circuits 360, and control/power and line conditioning 370. A head-disk assembly (HDA) connector 330 connects power and control conductors from the printed circuit for routing to the head-disk assembly. For this embodiment, the ground plane is divided between a digital circuit ground plane 310 and an analog circuit ground plane 320. A clock 340, used to generate data to read and write data in the hard disk drive, is mounted on the printed circuit board 300 and coupled to the digital circuit ground plane 310. 
Similarly, digital circuits 350 that control the operations of the hard disk drive are also mounted on the printed circuit board and grounded on the digital circuit ground plane 310. Analog circuit 360, which operates on analog signals read from the head-disk assembly, is mounted on the printed circuit ground plane 320. The power and control signals from the analog circuits 360 are input to control/power line conditioning circuits 370 conditioning the power and control signals to reduce noise coupling in the actuator. The conditioned signals are then passed to the HDA connector 330.\nFIG. 4 illustrates a glide head or a downward facing merged head mounted on a suspension arm 420 and flying over the surface 424 of a rotating disk 422; disk 422 rotates in the direction of arrow 425. A linear actuator (not shown) controls the radial position of the head 402 with respect to the disk 422 by moving the suspension"} +{"output_text": " used preferably for filtering suspensions. The filter means is positioned in a cover of the centrifuge and extends perpendicular to the axis of the driving shaft. The filter means is usually a filter cloth or a filter belt.\nThe filter means is usually positioned in the cover of the centrifuge and extends perpendicular to the axis of the driving shaft. The filter means is usually a filter cloth or a filter belt.\nThe filter means is usually positioned in the cover of the centrifuge and extends perpendicular", "input_text": " inertia of the twelve pound ball would have to be approximately twenty-five percent lower than those of the sixteen pound ball.\nTypically, lightweight balls have a lower maximum moment of inertia than does a heavier ball. 
However, the decrease is usually not proportional to the decrease in weight and the result is that lightweight balls have different rotational dynamics than heavier balls and as a consequence, differing reaction characteristics.\nThe practice of removing weight from the core while maintaining the same core shape is to be particularly avoided. A fifteen pound ball cast with the same internal core shape typically will lose more than twenty-five percent of the differential moment of inertia of the sixteen pound version. For a 14 pound ball, the loss will be sixty-six percent. This decrease of differential moment of inertia results in the lighter balls having different rotational dynamics than heaver weight balls and therefore, different reaction characteristics.\nAccording to the present invention, proportional lowering of the moments of inertia and differential moments of inertia can be accomplished by maintaining the same radii of gyration and differential radius of gyration for all the balls in a family of balls of greatly varying weight. The present invention relates to a clarifying filter centrifuge of the type including an enclosed drum driven by a driving shaft, and a filter means positioned in a cover of the centrifuge and extending perpendicular to the axis of the driving shaft, and to a method of separating or filtering suspensions by means of said centrifuge.\nThere are two types of conventional centrifuges for filtering suspensions, namely solid jacket centrifuges and filter centrifuges.\nSolid jacket centrifuges are utilized preferably for clarifying liquids. The heavy phase is deposited and collected on the wall of the drum while the light phase of the liquid, which is also a liquid, flows through an overflow weir.\nIn filter centrifuges, the liquid flows through filter cakes and filter means. This type of centrifuge is"} +{"output_text": " the fast page mode and the normal page mode. 
In this case, the processor LSI is connected to two banks of DRAM chips, and the processor LSI is also connected to a cache memory. The processor LSI is also connected to a bus which is used for transferring data between the processor LSI and the cache memory.\nIn this case, the processor LSI is connected to the two banks of DRAM chips through a first bus, and the processor LSI is also connected to the cache", "input_text": " and a column address to be provided to the dynamic memory LSI in this order (see page 454), wherein read access time is 70 nanoseconds after the establishment of the externally provided address (1 nanosecond=1.times.10.sup.-9 second). Alternative to this read/write access, if a fast page mode (page 461) is used, after the first row and column addresses have been transferred, as long as second and subsequent accesses are made to the same row, transfer of the row address can be omitted, with the result that read access time required for the second and subsequent read accesses is reduced to 20 nanoseconds from the establishment of the external address.\nAn example of a DRAM control function designed for a conventional microprocessor (hereinafter simply called the \"processor\") is described in \"Hot Chips IV\", pp. 4.2.2-4.2.12, August, 1992, held in Stanford University. On page 4.2.3 of this document, a drawing is illustrated in which a processor LSI is directly connected to two banks of DRAM chips. Also, timing charts on pages 4.2.7 and 4.2.8 of this document respectively include descriptions \"Check fast page cache-hit\" and \"Check fast page cache-miss\", from which it can be predicted that the fast page mode of the dynamic memory is used under certain hit conditions within the processor. This operation would be enabled, for example, by storing a row address with which a dynamic memory has been accessed at the previous time. 
The above-mentioned document, however, does not at all refer to how to use two-bank DRAM's or the relation between the cache-hit of the high speed mode and the two-bank DRAM's.\nAssume now a conventional processor LSI which includes, among its terminals, dynamic memory address terminals which are used for both"} +{"output_text": " vacuum in the tank. The vacuum draws water from the reservoir into the tank. The water level in the tank rises until the valve is again opened and water is again discharged from the tank.\nThe tank described in the above patents is designed to be inverted and then removed from the humidifier base. The tank is then inverted and re-installed on the base. The tank is then filled with water and the cap is replaced on the tank. The tank is then inverted and removed from the base.", "input_text": " tank.\nVarious types of humidifiers are used to provide moisture to indoor air. Included among such humidifiers are ultrasonic humidifiers, steam humidifiers or vaporizers, and evaporative humidifiers.\nUltrasonic humidifiers employ a high-speed oscillator, positioned a given distance below the water surface, to energize the water and break it into a fine mist. A fan carries the mist into the surrounding environment. It is critical that the distance from the oscillator to the water level be accurately maintained to ensure that the oscillation energy is efficiently transferred to the water. A drop in water level can result in permanent damage to the oscillator. The water level generally is maintained by the use of an inverted water tank such as that described in U.S. Pat. Nos. 5,210,818 and 5,247,604. The tank is sealed and includes a carrying handle on its top surface while a bottom surface includes an opening to which a cap is attached. When the tank is inverted beneath a spigot and the cap is removed the opening serves as a fill opening. 
Often the cap includes a valve system which seals the fill opening unless the tank is properly positioned on a humidifier base and the valve is engaged by a valve actuator in the base. The valve actuator opens the valve and allows water to escape from the tank into a reservoir defined by the base. Discharging water is exchanged for air which enters the tank through the same opening. As water flows into the base reservoir, the water level rises until it seals the valve and prevents air from getting into the tank. At this level, which is the normal operating water level for the humidifier, water flow from the tank ceases. The design of the humidifier is established to position the oscillator that given distance below this level. As the oscillator and fan cause dispersal of moisture from the reservoir, the water level attempts to drop creating a"} +{"output_text": " or AT measures a signal strength of a forward channel and feeds back the measured strength to a BS or AP. The BS or AP performs scheduling using the fed-back strength information and transmits data to the MS or AT.\nIn the wireless communication system for performing packet data communication, a BS or AP transmits a reference signal to an MS or AT. The reference signal is used for channel estimation and data demodulation. The reference signal is also used for a channel quality measurement.\nIn the wireless", "input_text": " in the OFDMA system. As illustrated in FIG. 1B, it can be seen that subcarriers carrying data to be transmitted to one MS are scattered, which is different from the AMC mode of FIG. 1A. The diversity transmission is suitable for the case where a transmission of a combination of data of one user in a particular sub-band is not easy because a data transmitter cannot know a channel state. 
The diversity transmission is also suitable for a channel to be transmitted to unspecified users as in broadcasting.\nThe above-described OFDMA wireless communication system conventionally transmits packet data. The system for transmitting the packet data has the structure of FIG. 2. FIG. 2 is a conceptual diagram illustrating a relation between a BS or Access Point (AP) and MSs or Access Terminals (ATs) in a wireless communication system for performing packet data communication.\nReferring to FIG. 2, MSs or ATs 211, 212, 213, 214, and 215 communicate with the BS or AP 200 through a predetermined channel. The BS or AP 200 transmits a predetermined reference signal, for example, a pilot signal. The MSs or ATs 211\u02dc215 measure the strength of a signal received from the BS 200 and feed back information about the measured strength to the BS or AP 200, respectively. Thus, the BS or AP 200 performs scheduling using information about strengths of signals received from the MSs or ATs and transmits data to the MSs or ATs. In FIG. 2, the arrows from the BS or AP 200 to the MSs or ATs 211\u02dc215 are signals transmitted on forward channels and the arrows from the MSs or ATs 211\u02dc215 to the BS or AP 200 are signals transmitted on reverse channels.\nAs described with reference to FIG. 
2, a mobile communication system for performing packet data communication widely employs a scheme in which an MS"} +{"output_text": " is irradiated with ultraviolet rays, such as polarized rays or non-polarized rays, to thereby form an orientation film.\nHowever, the related art roll printing method has the following problems.\nFirst, the related art roll printing method has a problem in that the alignment solution is not uniformly coated on the anilox roll.\nSecond, the related art roll printing method has a problem in that the alignment solution is not uniformly transferred onto the rubber plate.\nThird, the related art roll printing method", "input_text": " substrate, and ultraviolet rays, such as polarized rays or non-polarized rays, are irradiated onto the orientation film. A reaction resulting from the irradiation orients the orientation film in a designated direction.\nWhen using either the rubbing or the light irradiating method, a thin orientation film or alignment layer having a small thickness is uniformly deposited on a substrate. A related art roll printing method is used to deposit the orientation film.\nFIG. 6 illustrates a method for forming or coating an alignment film using a related art roll printing method and device.\nAs illustrated, generally, an alignment film is formed using a printing method using a plurality of rolls. Namely, an alignment solution 10224 supplied between a cylindrical anilox roll 10222 and a cylindrical doctor roll 10223 is uniformly coated entirely on the anilox roll 10222 as the anilox roll 10222 and the doctor roll 10223 are rotated. In this case, the alignment solution 10224 is supplied by a dispenser 1021 in an injector type.\nThe anilox roll 10222 is rotated in contact with a printing roll 10224 having a rubber plate or printing mask 10225 attached on a certain region of its surface, thereby transferring the alignment solution 10229 on the anilox roll 10222 to the rubber plate 10225. 
The rubber plate 10225 corresponds to a substrate 10226 on which the alignment solution or orientation material 10229 is to be coated, and has a master pattern to allow the alignment film to be selectively printed on the substrate 10226.\nAs a printing table or substrate stage 10227 with the substrate 10226 loaded thereon is moved in contact with the printing roll 10224, the alignment solution 10229 which has been transferred to the rubber plate 10225 is re-transferred onto the substrate 10226 to thereby form an alignment film.\nNext, with the alignment film formed on the substrate, the alignment film"} +{"output_text": " as a mixture of the cis- and trans-isomers. The trans-:cis- isomer ratio of the product 1,4-cyclohexanedimethanol is therefore dependent on the trans-:cis- isomer ratio of the starting material dimethyl 1,4-cyclohexanedicarboxylate.\nThe trans-:cis- isomer ratio of dimethyl 1,4-cyclohexanedicarboxylate is known to be dependent on the reaction conditions. For example", "input_text": "hexanedimethanol is also discussed in U.S. Pat. No. 2,917,549, in U.S. Pat. No. 4,999,090 and in GB-A-988316.\nA liquid phase process for the production of 1,4-cyclohexanedimethanol by plural stage hydrogenation of dimethyl terephthalate is described in U.S. Pat. No. 3,334,149. This utilises a palladium catalyst to effect hydrogenation of dimethyl terephthalate to dimethyl 1,4-cyclohexanedicarboxylate, followed by use of a copper chromite catalyst in the liquid phase to catalyse the hydrogenation of that diester to 1,4-cyclohexanedimethanol. In the procedure described in Example 1 of that patent specification a residence time of about 40 to 50 minutes is used in the second stage of this process. The activity of the copper chromite catalysts recommended in U.S. Pat. No. 3,334,149 is such that long residence times are required.\nIn a liquid phase process for the production of 1,4-cyclohexanedimethanol, such as is disclosed in U.S. Pat. No. 
3,334,149, the trans-:cis- isomer ratio of the product 1,4-cyclohexanedimethanol will tend towards an equilibrium value. This equilibrium value has been reported variously and may lie between about 2.57:1 (trans-:cis- 1,4-cyclohexanedimethanol) (as reported in GB-A-988316) and about 3:1 (as reported in U.S. Pat. No. 2,917,549). However, the starting material, dimethyl 1,4-cyclohexanedicarboxylate, is generally commercially obtainable"} +{"output_text": " is relatively heavy, weighing about 2.5 lbs. or more. The winch 10 is also relatively large and bulky, and is not easily stored or transported. The winch 10 is also relatively expensive to manufacture and assemble.\nAccordingly, there is a need for a winch that is lighter in weight, less bulky, and easier to store and transport. There is also a need for a winch that is less expensive to manufacture and assemble. There is also a need for a winch", "input_text": " a slot 26a in which a cargo-retaining strap S is inserted and then wound around the spool for storage/use. The strap S is payed-out from the spool 26 as needed by counter-clockwise rotation of the spool, and retracted as needed by clockwise rotation of the spool 26. The spool 26 includes a driving head (not shown) that projects outwardly from sidewall 24b and that is engaged by a winch bar or other tool to rotate the spool. A ratchet wheel 28 is welded to the spool 26 and rotates therewith adjacent an outer face of sidewall 24a. A pawl 30 is pivotally secured to the sidewall 24a by a bolt, pin or other fastener 32 and pivots between a first position, as shown, where it engages the ratchet wheel 28 and prevents counter-clockwise rotation of the ratchet wheel 28 and spool 26 but allows clockwise rotation for strap tightening operations, and a second position, where it is disengaged from the ratchet wheel 28 to allow free rotation of the ratchet wheel 28 and spool 26 in either direction. 
The pawl 30 is normally positioned in its first position, as shown, by force of gravity and/or a biasing spring. The base 22 of the winch 10 can also be configured to mate slidably with a flanged side-rail of a cargo trailer or cargo bed, e.g., of a \u201cflat-bed\u201d trailer.\nThe winch 10 shown in FIG. 1 has been found to be sub-optimal for a variety of reasons. The frame 20, spool 26, ratchet wheel 28 and pawl 30 are defined from ferrous steel and are susceptible to corrosion and can weigh as much as 9 pounds (lbs.) or more. Furthermore, the frame 20, itself,"} +{"output_text": " the inner surface of the rotor yoke 62, and a shaft 64 inserted into the rotor yoke 62. The stator yoke 58 is formed by laminating electromagnetic steel plates, and has a plurality of slots 65 formed in the stator yoke 58. The magnetic field coil 59 is formed by winding a wire around the outer periphery of the stator yoke 58. The magnetic field coil 59 is disposed in the slots 65 of the stator yoke 58.\nThe above coreless type motor 60 is", "input_text": " appended claims, it should be clearly understood that changes in the precise embodiments of the herein disclosed invention are meant to be included within the scope of the claims, except insofar as they may be precluded by the prior art. Conventional brushless DC motors known include a core type motor shown in FIG. 50 and a coreless type motor as shown in FIG. 51 to FIG. 57.\nAs shown in FIG. 50, the above core type motor 52 has a stator yoke 53 formed by layering electromagnetic steel plates which are punched out into a certain form, and has a magnetic field coil 54 disposed in a slot section of the stator yoke 53 in the insulated state. A rotor magnet 56 is disposed on a rotor 55. 
And, a motor section comprises the stator yoke 53 and the rotor 55, and a circuit unit section 57 consists of a circuit element and the like.\nThe above coreless type motor 60 has a magnetic field coil 59, which is formed in the shape of a ring by a self-fusing line, disposed in an air-core type stator yoke 58 as shown in FIG. 51. The magnetic field coil 59 is formed as shown in, for example, FIG. 52 and FIG. 53. And, a rotor 61 is provided with a rotor magnet 63. And, the stator yoke 58 and the rotor 61 form a motor section, and a circuit unit 76 consists of a circuit element and the like.\nIn the motors 51 and 60 having the above structures, assembling can be made by fitting from one direction of a motor bearing supporter (boss).\nThe coreless type motor 60 is known described in, for example, Japanese Patent Application Laid-open Print No. 23754/1989. The brushless DC motor 60 of this type has a rotor 61 consisted of a cup-shaped rotor yoke 62, a ring-shaped rotor magnet 63 adhered to"} +{"output_text": " substrate) and the CF substrate (or, alternatively, TFT substrate) is sealed.\nHowever, the liquid crystal dropping method has a problem in that the liquid crystal is likely to be dropped onto the TFT substrate outside the main sealing area.\nIn addition, the liquid crystal dropping method has a problem in that the liquid crystal is likely to be dropped onto the CF substrate outside the main sealing area.\nIn addition, the liquid crystal dropping method has a problem in that the liquid crystal is likely to", "input_text": " is a process for assembling a driving circuit part for processing input and output signals, connecting the liquid crystal panel to a signal processor, and assembling some frames, thereby completing the liquid crystal module.\nThe step of filling liquid crystal into the liquid crystal cell in the liquid crystal cell process step can be explained as follows.\nIn the liquid crystal filling step, a liquid crystal material is contained in a 
container disposed in a chamber. The chamber is maintained in a vacuum state for removing moisture and air dissolved in the liquid crystal material or contained inside the container. While maintaining the vacuum state of the chamber, a liquid crystal filling hole in the empty liquid crystal cell is dipped in the container, and brought into contact with the liquid crystal material. Then, the chamber is vented from a higher vacuum state to a lower vacuum state, and eventually to the atmospheric pressure state. Accordingly, the liquid crystal material is filled into the empty liquid crystal cell through the liquid crystal filling hole by a pressure difference between a pressure in the liquid crystal cell and a pressure in the chamber.\nHowever, the above described liquid crystal filling method has poor productivity because the method needs long time for the liquid crystal filling. That is, before the liquid crystal material is filled into the liquid crystal cell, the large assembled panel must be cut into unit panels, a portion of the unit panel must be dipped into the container, and the liquid crystal filling hole must be brought into contact with the liquid crystal material while the chamber is kept at a vacuum state. Moreover, a large sized LCD is likely to have some defects coming from imperfect filling of the liquid crystal material into the cell.\nWith regard to this, a liquid crystal dropping method has been developed in which a fixed amount of the liquid crystal is dropped onto an inner surface of the TFT substrate in a corresponding area on the TFT substrate inside a main sealing area formed around the CF substrate (or, alternatively, TFT"} +{"output_text": " Patent Application Laid-Open No. H10-290198\nPatent document 2: Japanese Patent Application Laid-Open No. H10-290199\nPatent document 3: Japanese Patent Application Laid-Open No. H10-290197\nPatent document 4: Japanese Patent Application Laid-Open No. 
H10-290196\nPatent document 5: Japanese Patent Application Laid-Open No. H10-290195\nPatent document 6: Japanese Patent Application Laid-Open", "input_text": " washed away even though water penetrates into the bearing and make it difficult for rust to be generated even though a solution of salt penetrates into the bearing to maintain the lubricating properties of the bearing and the like for a long time (patent document 8).\nThe rolling bearing for use in the food machine is known in which the solid lubricant, for the food machine, which is not washed away with water and withstands a successive use at a high temperature higher than 150\u00b0 C. is enclosed in the bearing to make it difficult for rust to be generated in a condition in which the bearing contacts a solution of salt (patent documents 9 and 10).\nIn these rolling bearings for the food machine, although the solid lubricant is not washed away with water, the solid lubricant is produced by the method of kneading the resin and the lubricant in advance to obtain the greasy resin and thereafter calcining the mixture, with the resin enclosed in the bearing. Thus even in the combination of the resin and the grease both of which can be used at a high temperature as disclosed, the resin is calcined at a high temperature. Therefore there is a possibility that the lubricating oil deteriorates while the resin is being calcined. Consequently in putting the bearing to practical use, there may be a case in which restrictions are imposed on the combination of the resin and the grease both of which can be used at a high temperature. Thus the solid lubricant has a problem that the degree of freedom in the combination of the resin and the grease suitable for a use is low (patent documents 8 to 10). Further to prevent the bearing from being rusted, a large amount of the solid lubricant is increased in the bearing. 
Therefore the bearing has a problem that it has a high torque (patent document 8) until the solid lubricant is compatible with the bearing.\nPatent document 1: Japanese"} +{"output_text": " of the surface that can be approximated by a flat geometry is called a patch. The smallest patch that can be approximated by a flat geometry is called a patch. The smallest patch that can be approximated by a flat geometry is called a patch. The smallest patch that can be approximated by a flat geometry is called a patch. The smallest patch that can be approximated by a flat geometry is called a patch. The smallest patch that can be approximated by a flat geometry is called a patch. The smallest patch that", "input_text": " parametric surfaces and topological data including a set of faces each defined as a portion of the 2D domain of a respective parametric surface.\nOnce (or while being) defined by designers, such surfaces often need to be rendered, e.g. in order to be displayed. Systems offering rendering functionality may indeed for example display surfaces e.g. during the modeling or for the purpose of review/modification by designers. Today, a computer's Graphic Processor Unit (GPU) can display thousands of triangles really efficiently. However, a GPU cannot display a surface provided as a B-Rep directly. Such surfaces must be transformed into a set of primitives that can be handled by the GPU. This is also the case for geometric operators (i.e. operators performing geometric or Boolean operations on several surfaces, e.g. for collision tests). Such transformation is called tessellation. Known tessellation methods approximate a surface by covering said surface with a pattern of flat polygons, usually triangles, with no gaps or overlaps, so as to fit as best as possible the mathematical definition of that surface, and sometimes by associating normal vectors to the pattern of flat polygons equal to corresponding surface normal vectors (i.e. 
vectors normal to the initial surface at corresponding positions) for the purpose of surface shading.\nThus, a surface of a modeled object can be processed via two different models. The first model is the exact model, which stands for the mathematical definition of surfaces. With the exact model, a user can design the surface and apply, on the surface, operators such as trim, bevel, or fillet operators. The second model is an approximated model, which is a geometric representation of the exact model usually used for visualization and geometric operations. Approximating smooth surfaces with flat geometry leads to discretization problems. This is a well-known problem in CAD applications. The smallest subset"} +{"output_text": " those shown in FIG. 1. The hysteresis curve of FIG. 1 is a plot of the polarization of a ferroelectric capacitor as a function of the applied voltage. The hysteresis curve of FIG. 1 is a plot of the polarization of a ferroelectric capacitor as a function of the applied voltage. The hysteresis curve of FIG. 1 is a plot of the polarization of a ferroelectric capacitor as a function of the applied voltage. The hysteresis curve of FIG. 1 is a plot of", "input_text": " seen that a negative potential can be used to change the polarization of a capacitor from point D to point A. Therefore, points A and D can represent two logic states occurring when zero volts are applied to the capacitor and which depend upon the history of voltage applied to the capacitor.\nThe reading of the polarization of the ferroelectric capacitor can be a destructive read in which a pulse is applied to the ferroelectric capacitor and the amount of resultant charge is either low if the pulse polarity agreed with the previous memorization polarity, or the resultant charge is higher if the charge polarity placed on the capacitor is of the opposite polarity last placed across the plates of the capacitor. 
This minute difference between an agreeable charge and an opposite charge can be measured to determine what the previous polarization on the ferroelectric capacitor was as it was last written. If a large charge results from reading a memory cell, the memory cell polarization will move from one state to the other state, for example point A to Point D. Thus, the data read from the memory cell must be restored.\nThe fact that the ferroelectric capacitors require a destructive read to determine the last polarization, and the fact that the resultant charge differences of the ferroelectric capacitor between an agreeable applied pulse and an opposite applied pulse make the technique of reading and writing ferroelectric memories a difficult task. The benefit of having a nonvolatile memory in which stored data remains without any battery backup or other external application of power is of great use in the computer and control industries. However, for any such nonvolatile memories to be of any use, the memories must be of a high enough density and must have a fast enough response time to make them commercially more attractive than battery backed up DRAM, mechanical disk storage and other types of nonvolatile storage.\nOne of the shortcomings of the prior art is the fact that the ferroelectric capacitors age through use, producing distinctly nonlinear hysteresis curves such as"} +{"output_text": " improving the sensitivity and the resolution of a resist composition. 
The present invention has been accomplished on the basis of this finding.\nAn object of the present invention is to provide a resist composition which is excellent in sensitivity and resolution and which is capable of forming a pattern having a high aspect ratio, and a method for forming a pattern using the same.\nAnother object of the present invention is to provide a method for forming a pattern using the above-described resist composition.\nStill another object of the present", "input_text": "ylic acid, 2-amino-4-nitrophenol, and triazine compounds such as 2-(p-chlorophenyl)-4,6-trichloromethyl-s-triazine. Among them, pyrrolidone, N-methylpyrrolidone, o-aminobenzoic acid, m-aminobenzoic acid, p-aminobenzoic acid and 1,2-phenylenediamine are typical examples.\nAlthough the above-described nitrogen-containing compounds can ease the T-top problem at an acid dissociation constant pKa ranging from 2 to 6, they cannot control reaction, that is, acid diffusion upon use of a highly-reactive acid-labile group.\nWhen a weak base is added, a dark reaction in PED proceeds at an unexposed portion, thereby causing a reduction in a line size (slimming) and a decrease in film thickness on the line surface. Addition of a strong base having a pKa of 7 or greater is effective for overcoming the above-described problem.\nHowever, higher pKa does not always bring about good results. Even when a superstrong base such as DBU (1,8-diazabicyclo[5,4,0]-7-undecene) or DBN (1,5-diazabicyclo[4,3,0]-5-nonen), proton sponge or a quaternary amine such as tetramethylammonium hydroxide is added, a sufficient effect is not available.\nThe present inventors have carried out various investigations. As a result, it has been found that amines represented by formulas (I) to (III) and (1) to (4) having a carbonyl group, an ester group, or a carbonate group are highly effective for preventing a decrease in the thickness of a resist film and also for"} +{"output_text": ". Cell Res., 218:1-9; Kimura, H. 
(1993) Proc. Natl Acad. Sci. USA, 90:2165-9; Kimura, H. (1994) J. Biol. Chem., 269:15175-15180; Kimura, H. (1995) J. Biol. Chem., 270:15175-15180; Kimura, H. (1996) J. Biol. Chem., 271:15175-15180", "input_text": " resulting biological response(s) (Fantl, et al., (1993) Ann. Rev. Biochem., 62:453-81; Klint, et al., (1999) Frontiers in Bioscience4: D165-77). Concomitantly, the ligand is internalized and subjected to degradation or other alternative fates (Cuatrecasas, (1982) Epidermal growth factor: uptake and fate. Ciba Foundation Symposium, 96-108; Lewis, et al., (1996)Exp. Eye Res., 62:309-24; Massagu, etal., (1986) J. Cell. Phys., 128:216-22; Naka, et al., (1993) Febs Letters, 329:147-52; Sorkin, et al., (1988) Exp. Cell Res., 175:192-205). However, mounting evidence for a number of growth factors and cytokines (FGF, nerve growth factor, PDGF, Schwannoma-derived growth factor, insulin, angiotensin 11 and growth hormone) suggest that they may act intracellularly and in many cases support a site of action for these factors in the nucleus (Jans, et al., (1998) Bioessays, 20:400-11; Prochiantz, et al., (1995) Bioessays, 17:39-44; Imamura, et al., (1990) Science, 249:1567-1570; Kimura, H. (1993) Proc. Natl Acad. Sci. USA, 90:2165-9). This has been extensively documented for the FGF family (Imamura, et al., (1990) Science, 249:1567-1570; Baldin, et al., (1990) EMBO J., 9:1511-1517; Imamura, et al., (1994) Exp"} +{"output_text": " path between the regions. This current path is referred to as a \"latch-up\" condition.\nLatch-up is a serious problem in CMOS circuits. Latch-up can occur in a variety of circuit configurations, including those which include bipolar transistors. In CMOS circuits, latch-up is a particular problem in CMOS circuits which include both N- and P-channel transistors. Latch-up can occur in CMOS circuits which include both N- and P-channel transistors when the", "input_text": " can be unseated and the circulation valve can be opened. 
Drilling may then resume.\nAn advantage of the present invention includes use of the pressure and resistivity sensors with the MWD system, to allow for real time data transmission of those measurements. Another advantage is that the present invention allows obtaining static pressures, pressure build-ups, and pressure draw-downs with the work string, such as a drill string, in place. Computation of permeability and other reservoir parameters based on the pressure measurements can be accomplished without pulling the drill string.\nThe packers can be set multiple times, so that testing of several zones is possible. By making-measurement of the down hole conditions possible -in real time, optimum drilling fluid conditions can be determined which will aid in hole cleaning, drilling safety, and drilling speed. When an influx of reservoir fluid and gas enter the well borehole, the high pressure is contained within the lower part of the well borehole, significantly reducing risk of being exposed to these pressures at surface. Also, by shutting-in the well borehole immediately above the critical zone, the volume of the influx into the well borehole is significantly reduced.\nThe novel features of this invention, as well as the invention itself, will be best understood from the attached drawings, taken along with the following description in which similar reference characters refer to similar parts, and in which: The invention relates to 3D integrated circuits, and more particularly to structures and methods for suppressing latch-up and noise coupling.\nA typical CMOS circuit includes N- and P-type regions arranged to form planar or multi-gate MOS transistors. Regions of opposite conductivity types which are adjacent each other typically form parasitic pn junctions and bipolar transistor structures. While usually reverse-biased, conditions can occur in which these structures become forward biased. 
When this occurs, a positive feedback loop ensues which provides a low resistance current"} +{"output_text": " of the packer. The packer is set in the well by setting down weight on the packer to close the inflation ports and open the treating ports. The packer is then released from the weight and the treating ports are closed and the inflation ports are opened by picking up weight from the packer. The packer is then set in the well by setting down weight on the packer to close the treating ports and open the inflation ports. The packer is then released from the weight and", "input_text": " production tubing and set in production casing below the production tubing. A Tam International advertising brochure entitled \"Tam-J.TM. Inflatable Workover/Testing Packers And Accessories Ordering Guide\" dated January, 1986, indicates at page 5 thereof under the heading \"Coil-Tubing Operations\" that smaller diameter Tam-J.TM. packers can be utilized on continuous coil tubing by removing the lugs from the J-slot mechanism and allowing the tool to be set, released and reset with straight up and down movement of the coil tubing. Thus, the J-slot mechanism is in effect eliminated from this straddle packer apparatus when it is utilized with coil tubing, which cannot be rotated.\nAll of the devices discussed above which are designed to be run on coiled tubing down through production tubing and then set in production casing are limited in their operating flexibility since they only have two operating positions which are achieved by either setting down weight or picking up weight. These tools are run into the well with their inflating ports in an open position, and after being located at the appropriate elevation in the well, the packers are inflated to seal them against the casing. Weight is then set down on the packers to close the inflation ports and open a treating port between the packers. 
Subsequently, weight is picked up from the apparatus to close the treating ports and reopen the inflation ports thus allowing the packers to deflate.\nU.S. Pat. No. 4,962,815 to Schultz et al., assigned to the assignee of the present invention, discloses an improved straddle packer apparatus designed to be lowered on coiled tubing down through production tubing and then set in production casing located below the production tubing. A lug and endless J-slot mechanism in this packer provides more than two different operating positions of the tool in response to simple vertical reciprocation"} +{"output_text": " (1997) present a case report of a patient with a fear of flying who was treated with virtual reality exposure therapy. The patient was exposed to a virtual environment of a plane flying over a city. The patient was able to tolerate the virtual environment and was able to tolerate the plane flying over the city. The authors state that virtual reality exposure therapy is a useful treatment for phobias. However, the authors do not provide any theoretical rationale for conducting virtual reality exposure therapy.\nA case report of", "input_text": " care for this condition is cognitive-behavior therapy. Distorted thinking significantly contributes to phobic symptoms. A phobia of heights involves the interaction of thinking, behavior, and physiological arousal. Some have correctly diagnosed or evaluated the condition of acrophobia, yet proposed to treat it by exposure to a virtual environment. However, it is not the subjective evaluation that causes anxiety. There is an interaction between thinking, behavior, and physiology that contributes to anxiety. A subjective evaluation may lead to fear, which is different than anxiety. Fear is a thought. Anxiety is a physiological state. Danger expectations may produce fear whereas anxiety expectations may produce physiological arousal (anxiety). 
So, mere exposure to real or virtual environments is not enough to treat the condition.\nA comprehensive theoretical and clinical discussion of fear, anxiety, panic, and acrophobia can be found in Virtual Therapy (Lamson, 1997). Prior studies exposed participants to virtual environments where the opportunity to perceive height and depth occurred. However, the method of treatment was not adequately explained and there was no theoretical or clinical rationale for exposure therapy. It differs from Virtual Therapy (Lamson, 1997) which describes a system of therapy for the treatment of acrophobia and other psychiatric conditions.\nCarlin et al. (1997) present a case report to demonstrate the use of immersive computer generated virtual reality (vr) and mixed reality (touching real objects seen in virtual reality) for the treatment of spider phobia. A patient was exposed to virtual spider scenes over 12 weeks with each session lasting a total of 50 minutes. Exposure to virtual reality spiders produced reduction in anxiety with some symptom relief. The case is difficult to assess because of apparent co-existing obsessive-compulsive difficulties. The authors define their intervention as virtual reality exposure therapy. However, no theoretical rationale for conducting 12 treatment sessions with the patient was discussed.\nNorth et al."} +{"output_text": ", and Toshiba T-SHIELD.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer wiring structure.\nIn recent years, with the progress of the high integration of semiconductor devices, the wiring width and the wiring pitch have been reduced. In order to cope with this, a multilayer wiring structure has been adopted. 
In the multilayer wiring structure, a plurality of wiring layers are", "input_text": "000 Cts s\u22121 cm\u22121 volume), resolution (7 to 12 mm FWHM), size, and cost.\nPresent SPECT systems mainly use the rotating Anger camera. Many different variations of the Anger camera and other smaller size rotating single or dual instruments have been designed and used. Most of the commercial instruments use NaI(Tl), CsI(Tl), CsF, BaF2, BGO and other related crystal detectors. The majority of the commercial instruments use the Anger cameras made of NaI(Tl) crystals. All commercial SPECT instruments use collimators for determination of the direction of the incident gamma rays. The main types are parallel and converging collimators. The converging fan or cone beam collimators produce higher sensitivity but increase the complexity of the data analysis. Pinhole and slit collimators are also used. The collimators for high resolution systems eliminate about 99.9 percent of the incident gamma rays. A typical collimator hole has an area of about 1 square millimeter and a length of 1.9 centimeters. Increasing collimator resolution decreases sensitivity and vice versa. Collimators made of high atomic number materials such as lead which also produce considerable amounts of scattered gamma rays on the inside surface of the collimator, thereby increasing the scattered photon background.\nAnger cameras are normally rotated on a gantry around the patient for about 20 minutes to acquire sufficient data for a reasonable image. The spatial resolutions are limited to about 7 to 12 millimeters although spatial resolutions are expected to reach 6 millimeters in the near future. The best energy resolution at gamma ray energies is about 10 percent, limiting the ability of Anger cameras to discriminate scattered photon background. 
Commercially available SPECT systems include ADAC ARC, GE Starcam, Elscint APEX, Trionix Triad"} +{"output_text": " shielding property, and therefore, it is possible to prevent the heat ray from entering into the interlayer film, thereby it is possible to prevent the interlayer film from being heated.\nThe amount of tin-doped indium oxide and/or antimony-doped tin oxide contained in the above-mentioned adhesive resin is preferably about 0.1 to 10 wt %, more preferably about 0.5 to 5 wt %, and even preferably about 1 to 3 wt % or so.\nWhen the", "input_text": ", and excellent transparency and weather resistance is realized, and in addition, polyvinylbutyral resin itself is easily produced.\nPolyvinylbutyral resin obtained by hereinabove method consists of vinylbutyral, vinylalcohol and vinylacetate components.\nThe amount of each component mentioned above-can be determined according to, for example, JIS K-6728 xe2x80x9cMethods for testing polyvinylbutyralxe2x80x9d or infrared absorption spectrum (IR).\nIn the case of polyvinylacetal resin other than polyvinylbutyral resin, measuring the amount of vinylalcohol and vinylacetate components is the first place, then the amount of vinylacetal can be calculated by subtracting the sum of the above-mentioned two components from 100.\nThe average butyralization degree of above-mentioned polyvinylbutyral resin is not specifically limited, but is preferably about 60 to 75 mol % or so and even preferably about 62 to 72 mol % or so.\nWhen the average butyralization degree of polyvinylbutyral resin is less than 60 mol %, solubility with plasticizer mentioned later may be lowered, thereby it may be difficult to mix polyvinylbutyral resin with plasticizer of the necessary amount to obtain penetration resistance. 
On the other hand, when the average butyralization degree of polyvinylbutyral resin exceeds about 75 mol %, it may fail to obtain dynamic property necessary to obtain penetration resistance.\nFor the interlayer film of the present invention, it is necessary to contain tin-doped indium oxide and/or antimony-doped tin oxide in the above-mentioned adhesive resin to give heat insulation to the interlayer film.\nNamely, tin-doped indium oxide and/or antimony-doped tin oxide has an excellent infrared ray (heat ray)"} +{"output_text": " prostate and the cryogenic fluid is circulated through the probes to freeze the prostate. The probes are then withdrawn from the prostate and the prostate is thawed.\nThe use of cryosurgical probes for cryoablation of the prostate is described in, for example, U.S. Pat. No. 5,931,868 to Onik, et al. and U.S. Pat. No. 5,931,869 to Onik, et al. The", "input_text": " the invention, the PSMM generated by the mobile station carries pilot strength information that was not used to generate the PSMM. 1. Field of the Invention\nThe present invention relates to urological warming and cooling devices and more particularly to a warming catheter and method of warming the urethra of a patient during ablative surgery. The apparatus is particularly useful in cryosurgery to prevent damage to tissues surrounding a surgical site from the extremely cold temperatures employed therein. The apparatus is especially useful during transperineal cryoablation of the prostate gland in human males to maintain the temperature of the urethral tissues and thereby prevent urethral sloughing. The apparatus may also have utility where it is desired to lower the temperature of surrounding tissues, such as during laser ablation.\n2. Description of the Related Art\nCryosurgical probes are used to treat a variety of diseases. 
The cryosurgical probes quickly freeze diseased body tissue, causing the tissue to die after which it will be absorbed by the body, expelled by the body or sloughed off. Cryothermal treatment is currently used to treat prostate cancer and benign prostate disease, breast tumors and breast cancer, liver tumors and liver cancer, glaucoma and other eye diseases. Cryosurgery is also proposed for the treatment of a number of other diseases.\nThe use of cryosurgical probes for cryoablation of the prostate is described in, for example, Onik, Ultrasound-Guided Cryosurgery, Scientific American at 62 (January 1996). Cryosurgical probe systems are manufactured by present assignee, Endocare, Inc. of Irvine, Calif. In cryosurgical ablation procedures generally several cryosurgical probes are inserted through the skin in the perineal area (between the scrotum and the anus), which provides the easiest access to the prostate. The probes are pushed into the"} +{"output_text": " Y3. The luminance stimulus Y3 is a value of luminance in the tristimulus values of color (light) received by the viewer's eyes when R1, G1, and B1 are inputted to the image display means 3. The luminance stimulus Y3 is a value of luminance in the tristimulus values of color (light) received by the viewer's eyes when R1=100, G1=100, and B1=100 (when displaying white).", "input_text": " the image display means 3.\nConsider now the instance that there is no influence of external light, by referring to FIG. 31. If there is no influence of external light, X2=Y2=Z2=0. When the maximum values of image data R1, G1, and B1, i.e., 100, 100, and 100, are inputted to the image display means 3, the tristimulus values of color (light) received by the viewer's eyes are X1=96.05, Y1=101, and Z1=109.9, in a situation where there is no influence of external light. 
On the other hand, when the minimum values of image data R1, G1, and B1, i.e., 0, 0, and 0, are inputted to the image display means 3, the tristimulus values of color (light) received by the viewer's eyes are X1=1, Y1=1, and Z1=1, in a situation where there is no influence of external light.\nIn FIG. 31, the ratio of Y3 that corresponds to luminance in the tristimulus values of color (light) received by the viewer's eyes when R1, G1, and B1 are inputted to the image display means 3, to Y3 when R1=100, G1=100, and B1=100 (when displaying white), is indicated as a ratio to white (Y/Ymax). The viewer seems that the image displayed on the image display means 3 has a larger contrast and more excellent visibility as the value of ratio to white is smaller to each image data.\nFIG. 32 is a graph showing the relationship between image data R1, G1, and B1 inputted to the image display means 3, and a luminance stimulus"} +{"output_text": " and the movable traverse with the knurling head is moved forward to the next profile element to be shaped.\nThe knurling head is provided with a device for feeding the knurl rollers. This device is a feed screw which is driven by a motor. The motor is controlled by a control unit. The control unit is provided with a memory for storing the profile elements to be shaped. The memory is provided with a read-out device for reading the profile elements to be shaped. The control", "input_text": " fixed support member actuates the base of a respective tapered element. The bar actuates this tapered element to move it in the axial direction relative to the centers.\nThe knurling head has a housing provided with radial slots. 
Each of these slots accommodates a slider with the knurl roller and the tapered element of said device associated kinematically therewith and installed between the bottom of the slot and the slider.\nWhen the tapered element moves its inclined surface actuates the inclined surface of the slider, thus moving said slider with the knurl roller in the radial slot towards the work piece at a distance equal to a preset value of the radial feed of the knurl rollers.\nAfter the knurl rollers are fed radially, one progressive motion of the movable support members with the knurling head is performed. In the course of this motion of the movable support member with the knurling head the knurl rollers start processing the workpiece thus shaping the profile elements to a depth relative to the value of the first radial feed of the knurl rollers. This motion is a working travel. In the course of the backward travel of the movable traverse with the knurling head the knurl rollers start processing the work piece only in the range of the resilient deformations and therefore this motion of the movable support member is an idle travel. These reciprocations of the movable support member repeat until the profile elements are fully shaped. 
At the end of each backward travel of the movable support member with the knurling head whose length increases constantly the bars actuate the tapered elements whereas said bars move the tapered elements to a distance relative to the value of the radial feed of the knurl rollers for shaping the profile at a working travel of the movable support member with the knurling head.\nWhen the profile shaping cycle has been finished the movable support member with the knurling head is stopped"} +{"output_text": "", "input_text": " TS methylates deoxyuridine monophosphate to produce deoxythymidylate, providing a unique de novo source of thymidylate.\nPart of the large inter-individual variability in the response to methotrexate is related to common polymorphisms in genes implicated in methotrexate pharmacokinetics or pharmacodynamics (Relling and Dervieux, Nat. Rev. Cancer, 1:99-108 (2001)). Recently, a G to A transition in exon 1 (position 80) of RFC-1, resulting in an arginine to histidine substitution at codon 27, was identified (Chango et al., Mol. Genet. Metab., 70:310-315 (2000)). However, the functional consequence of this polymorphism on methotrexate transport has remained unclear (Whetstine et al., Clin. Cancer Res., 7:3416-3422 (2001); Laverdiere et al., Blood, 100:3832-3834 (2002)). Moreover, a recent study of children with acute lymphoblastic leukemia has suggested that the A variant may be associated with poor clinical outcomes as compared with patients having the G/G genotype; individuals carrying the A/A genotype presented higher plasma concentrations of methotrexate compared to those with the G/G or G/A genotypes (Laverdiere et al., supra, (2002)).\nBecause individual differences in pharmacokinetic and pharmacodynamic parameters can be difficult to predict and because patient genotype affects these parameters, methotrexate treatment can be rendered safer and more effective through patient genotyping. 
Thus, there exists a need for novel correlations between patient genotypes and efficacy of methotrexate therapy. There also exists a need for new methods of determining or optimizing the efficacy of methotrexate therapy by determining MTXPG levels in a patient through genotyping. The present invention satisfies these needs and provides related advantages as well."} +{"output_text": " has a first electrode connected to the driving transistor, and the second pixel has a second electrode connected to the driving transistor; wherein the first pixel has a first auxiliary layer formed between the first electrode and the one or more light-emitting elements, and the second pixel has a second auxiliary layer formed between the second electrode and the one or more light-emitting elements; wherein the first pixel has a first auxiliary layer formed between the first electrode and the one or more light-emitting elements, and the second pixel", "input_text": " comprises an auxiliary layer formed between the third pixel's first electrode and the third pixel's one or more light-emitting elements or between the third pixel's one or more light-emitting elements and the third pixel's second electrode.\nIn some embodiments, the light emitting member of the first pixel is thicker than the light emitting members of the second and third pixels.\nIn some embodiments, the light emitting members of the second pixel and the third pixel have substantially the same thickness.\nIn some embodiments, the first, second, and third pixels have substantially equal areas.\nIn some embodiments, the first, second, and third pixels each include a driving transistor connected to the respective pixel's first electrode, and the OLED display further comprises circuitry for providing a first driving voltage to the driving transistor of the first pixel and for providing a second driving voltage different from the first driving voltage to the driving transistors of the second and third pixels.\nIn some 
embodiments, the circuitry comprises: a first driving voltage line connected to the driving transistor of the first pixel for transmitting the first driving voltage; and a second driving voltage line connected to the driving transistors of the second and third pixels for transmitting the second driving voltage.\nIn some embodiments, the first, second, and third pixels each include a driving transistor connected to the respective pixel's first electrode, and the OLED display further comprises circuitry for generating data signals in response to luminance levels and supplying the data signals to the driving transistors, wherein for any luminance level, the corresponding data signal for the driving transistor of the first pixel has a larger voltage than the corresponding data signal for the driving transistor of each of the second and third pixels.\nSome embodiments include an OLED display comprising: a first pixel and a second pixel, wherein each of the first and second pixels comprises: a driving transistor; and one or more light-emitting elements connected to the driving transistor; wherein the first pixel"} +{"output_text": " manufactured to shape and that are configured to be worn by a user. The headgear segments include a first portion and a second portion. The first portion is configured to be worn on the user's head and includes a first portion of a first shape and a second portion of a second shape. The second portion is configured to be worn on the user's head and includes a first portion of the first shape and a second portion of the second shape. The first portion of the first shape is configured to", "input_text": " portion of the first area.\nU.S. Patent No. 20100258133 for \u201cFace mask\u201d by inventors Todd et al., filed Nov. 11, 2008 and published Oct. 14, 2010, is directed to a mask assembly for delivering gas to a patient that includes a mask body and a breathing circuit interface. 
The mask body includes an opening for reception of the gas and includes a seal structure for sealingly engaging with the face of the patient and surrounding at least the nose and mouth of the patient. The breathing circuit interface includes a first portion rotatably connected with the mask body and a second portion that is constructed and arranged to releasably connect with a conduit for delivering the gas to the patient through the opening.\nU.S. Publication No. 20080060648 for \u201cStability Medical Mask\u201d by inventors Thornton et al., filed Sep. 11, 2007 and published Mar. 13, 2008, is directed to a medical mask including a rigid sealing portion configured to cover and seal around at least a portion of a user's nose including the user's nostrils and a rigid stabilizing frame coupled to the rigid sealing portion. The rigid stabilizing frame includes a generally horizontal upper support member configured to bear against the user's forehead, a generally vertical support member coupled between the rigid sealing portion and the upper support member, and lower left and right support members coupled between the rigid sealing portion and the upper support member and configured to bear against the user's cheeks. The rigid stabilizing frame defines two openings configured to allow the user to see through the medical mask when the medical mask is positioned on the user's face.\nWIPO Publication No. WO2013026091 for \u201cManufactured to shape headgear and masks\u201d by inventors Dunn et al., filed Aug. 21, 2012 and published Feb. 28, 2013, is directed to a headgear or headgear segments that are"} +{"output_text": " used by the cache 34 to determine whether the content is stale or not. If the content is stale, the cache 34 will send a request to the Web site for the content. If the content is not stale, the cache 34 will send a request to the Web site for the content. If the content is stale, the cache 34 will send a request to the Web site for the content. 
If the content is not stale, the cache 34 will send a request to the Web site for the content", "input_text": " companies face when trying to scale large Web traffic. As Internet backbone technologies develop, many innovations, such as quality of service management, have been used to improve network bandwidth and improve Web content retrieval time. These improvements to infrastructure, however, cannot solve traffic problems occurring at any one point in the Internet. For example, in FIG. 1, an end user 10 in a network 12 in Japan wants to access a page in a content provider original Web site 14 in a network 16 in the U.S. The request will pass through several Internet Service Provider (ISP) gateways 18, 20, and 22 before it reaches the content provider original Web site 14. Because of gateway bottlenecks and other delay factors along the Internet paths between the user and the content provider original Web site 14, a content fetching and refreshing methodology utilizing a proxy server on the end user side of the gateways could provide faster response time.\nFIG. 2 illustrates a typical Web content delivery and caching scheme 24 which includes a caching system 26 connected to multiple non-specific Web sites 28 and 30. The caching system 26 is comprised of a proxy server or cache server 32 and cache 34. Alternatively, the caching system 26 of FIG. 2 can be replaced by a content delivery services provider and mirror sites, which would be connected to Web sites that have entered into subscriber contracts with the content delivery services provider. These subscriber Web sites will deliver content to the content delivery services provider for mirroring, but will not necessarily notify the content delivery services provider when the content has changed. In addition, it should be understood that the cache 34 may be proxy cache, edge cache, front end cache, reverse cache, and the like.\nIn FIG. 
2, when content is delivered from a Web site to cache 34, a header called a meta-description or meta-data is delivered along with the content. The meta-data may be"} +{"output_text": ". In the first stage, the resole precursor(s) and multi-hydroxy phenolic compound(s) are reacted with the modifying agent(s) to form a resole. In the second stage, the resole is reacted with the aldehyde compound(s) to form the dispersed novolak.\nThe dispersed novolak can be used as a binder in a radiation curable composition. The radiation curable composition can be a radiation curable ink, a radiation curable coating", "input_text": " mol of phenolic resin precursor(s). A dispersed resole typically can be obtained by reacting or mixing a resole precursor or a mixture of resole precursors with the modifying agent or a mixture of agents without any other reactants, additives or catalysts. However, other reactants, additives or catalysts can be used as desired. Multi-hydroxy phenolic compound(s) can optionally be included in relatively small amounts in the reactant mixture for the resole.\nHydrophilic resoles typically have a F/P ratio of at least 1.0. According to the invention, hydrophilic resoles having a F/P ratio much greater than 1.0 can be successfully dispersed. For example, it is possible to make an aqueous dispersion of hydrophilic resoles having a F/P ratio of at least 2 and approaching 3, which is the theoretical F/P ratio limit.\nPreferably, the dispersed novolak is produced by reacting 1 mol of modifying agent(s) with 2-20 mol of phenolic resin precursor(s) and, preferably, 2-20 mol of multi-hydroxy phenolic compound(s). An aldehyde compound, preferably formaldehyde, is also required to make the novolak. The aldehyde compound can optionally be added as a separate ingredient in the initial reaction mixture or the aldehyde compound can be generated in situ from the resole precursor. 
The resole precursor(s), multi-hydroxy phenolic compound(s) and modifying agent(s) co-condense to form the dispersed novolak. The reaction typically is acid catalyzed with an acid such as phosphoric acid. The F/P ratio of aldehyde compound(s) to combined amount of resole precursor(s) and multi-hydroxy phenolic compound(s) in the initial reaction mixture preferably is less than 0.9. Preferably, synthesis of the dispersed novolak is a two stage reaction"} +{"output_text": " The cold heat exchanger is typically located in the kitchen, and the ambient heat exchanger is typically located in the basement. The cold heat exchanger is typically located in the kitchen, and the ambient heat exchanger is typically located in the basement. The cold heat exchanger is typically located in the kitchen, and the ambient heat exchanger is typically located in the basement. The cold heat exchanger is typically located in the kitchen, and the ambient heat exchanger is typically located in the basement. The cold heat exchanger is typically located in", "input_text": " be increased, because having such tube lengths greater than the oscillatory displacement of the working gas does not help transfer more heat.\nThe usual solution to the scaleup of heat exchangers is to increase the number of tubes in proportion to the power, keeping the length and diameter of each tube constant. Such heat exchangers can have hundreds or thousands of tubes. Building such heat exchangers is expensive (because many parts must be handled, assembled, and joined) and such heat exchangers are unreliable (because so many joints must be leak tight). Thermally induced stress imposes an additional challenge to reliability when a geometrically complex heat exchanger is at an extreme temperature, such as a red-hot temperature for an engine or a cryogenic temperature for a refrigerator. 
Sometimes a pool boiler or heat pipe must be used to enforce isothermality in these circumstances so that thermally induced stresses are eliminated.\nAnother shortcoming of oscillating-wave engines and refrigerators is that their heat exchangers often must be located close to one another, simply because each heat exchanger must typically be adjacent to one end or the other of the nearest stack or regenerator or pulse tube or thermal buffer tube, and these components themselves are typically short. The practical importance of this shortcoming is easily appreciated by considering the food refrigerator in the typical American kitchen. The xe2x80x9cvapor compressionxe2x80x9d (also known as xe2x80x9creverse Rankinexe2x80x9d) cooling technology employed therein allows complete flexibility in the geometrical separation of the cold heat exchanger, where heat is absorbed from the inside of the cold box, and the ambient heat exchanger, where waste heat is rejected outside, to the air in the kitchen. The cold heat exchanger is typically located inside, above, or under the freezer, and the ambient heat exchanger is typically located behind or under the refrigerator cabinet."} +{"output_text": " the random data portion 24, the ECC circuit 28 and error correction coding/decoding cannot be used.\nReferring now to FIG. 4, the embedded memory 14 typically includes a random data portion 24 and a cache data portion 26. Bits that are stored in the random data portion 24 are accessed individually. In contrast, bits that are stored in the cache data portion 26 are accessed in blocks having a minimum size such as 16 or 64 bits.\nTo improve reliability, an error correction", "input_text": " read channels, a hard disk controller, an Error Correction Coding (ECC) circuit, high speed interfaces, and system memory. The logic 12 may include standard logic module(s) that are provided by the manufacturer and/or logic module(s) that are designed by the customer. 
The embedded memory 14 typically includes static random access memory (SRAM), dynamic random access memory (DRAM), and/or nonvolatile memory such as flash memory.\nReferring now to FIG. 2, low chip yield is due in part to the small size of the memory cells in the embedded memory 14. The small memory cells are used to reduce the chip size and lower cost. Typical defects include random single bit failures that are depicted at 16. For a 64 Mb memory module, on the order of 1000 random single bit failures 16 may occur. Other defects include bit line defects that are depicted at 18 and 20. While bit and word line defects occur less frequently than the random single bit failures 16, they are easier and less costly to fix.\nReferring now to FIG. 3, the embedded memory 14 typically includes a random data portion 24 and a cache data portion 26. Bits that are stored in the random data portion 24 are accessed individually. In contrast, bits that are stored in the cache data portion 26 are accessed in blocks having a minimum size such as 16 or 64 bits.\nTo improve reliability, an error correction coding (ECC) circuit 28 may be used. ECC coding bits 30 are used for ECC coding. For example, 2 additional bits are used for 16 bits and 8 additional bits are used for 64 bits. The ECC circuit 28 requires the data to be written to and read from the embedded memory 14 in blocks having the minimum size. Therefore, the ECC circuit 28 and error correction coding/decoding cannot be used for the random data portion 24. When accessing"} +{"output_text": " reorient the SmA director in the field direction. This field is known as the \u201csmectic electric field\u201d and is the primary cause of the dynamic scattering.\nThe dynamic scattering is a result of the movement of ions through the SmA electrolyte, which is a consequence of the movement of the dopant ions through the SmA electrolyte. 
The movement of the dopant ions is a consequence of the movement of the ionic charges through the SmA electrolyte. The movement of the ionic charges is", "input_text": " mono-domain, with the layers of the material lying parallel to the substrate and the directors of the individual LC molecules lying orthogonal to the layers and to the substrates. For many SmA materials this situation is only reversible by re-heating the cell to a nematic phase and so destroying the SmA alignment.\nBecause the switching from a clear state to a scattering state can only be reversed by such heating and subsequent cooling, SmA liquid crystals, with positive dielectric anisotropy, cannot alone form the basis of a practical electro-optic phenomenon. However a light scattering state can be electrically induced from a mono-domain clear state by smectic dynamic scattering (SDS), as described below, that disrupts the mono-domain state to form multi-domains, which allows a display to be reversibly switched between a homeotropically aligned clear transparent state and a disordered light scattering state. These two states are visible without polarised light.\nSmectic dynamic scattering uses a suitable ionic dopant that is dissolved in the smectic A liquid crystal host; under the influence of low frequency (e.g. <500 Hz) electric fields, two orthogonal forces attempt to reorient the SmA director. Dielectric re-orientation, as described above, attempts to align the SmA director (indicating the average direction of the long molecular axis) in the field direction, i.e. orthogonal to the plane of the electrodes/substrates. Simultaneously, the movement of dopant ions through the SmA electrolyte attempts to align the SmA director in the direction in which ions find it easier to travel. In SmA materials this direction is within the SmA layers, which lie orthogonal to the field direction, i.e. SmA materials have a \u201cnegative conductivity anisotropy\u201d. 
The cumulative effects of the movement of the ionic charges leads to a field arising in the plane of the layers that attempts to"} +{"output_text": " of the backplane, and the clock unit is located at the central node of the MCH slot. The clock unit is connected to the MCH slot through a clock bus. The clock unit is configured to generate a clock signal, and the MCH slot is configured to receive the clock signal. The clock signal is transmitted to the MCH through the clock bus.\nThe clock unit is configured to generate a clock signal, and the MCH slot is configured to receive the clock signal. The", "input_text": " provides data switching of at most 12 AMC cards. The clock unit implements the clock function of the system, including selecting a clock source, generating a system synchronization clock, and driving the generated system clock to each connected AMC card. The CLKx is configured to identify the clock type, and the JTAG unit is configured to perform the test function of the system.\nThe MicroTCA.0 R1.0 standard defines the maximum height of the MCH to be 6 HP (about 3 cm), which may be implemented through at most 4 Printed Circuit Boards (PCBs). That is, one MCH is composed of 4 PCBs, and the 4 PCBs of the MCH are limited to a 6-HP height. The first PCB enables the MCMC to manage the MicroTCA carrier, and provides the basic switching functions (Fabric A port). The second PCB is configured to implement the Fabric B port for system clock and data switching; the third and fourth PCBs provide Fabric C-to-Fabric G ports (namely, Fat Pipe) of Fabric switching on the data plane. Meanwhile, as defined in the standard, the connector between the MCH and the backplane is composed of 4 tongues, where tongue 1 provides a connection pin between the MCMC unit and JTAG unit, tongue 2 provides a connection pin of the clock unit, and tongue 3 and tongue 4 provide connection pins of the Fat Pipe (namely, ports from Fabric C to Fabric G).\nFIG. 
3 shows redundancy clock architecture of a MicroTCA carrier in the conventional art. In the MicroTCA.0 R1.0 standard in the conventional art, the MCH implements the clock functions. The MicroTCA system adopts a star clock topology. In the backplane, it is defined that an MCH slot is located at the central node"} +{"output_text": "control circuit 15.\nThe operation of the conventional MUSE signal interpolating circuit will be described below.\nThe pixel signals in the present frame are applied to the input terminal 1. The pixel signals one frame before and two frames before are applied to the input terminals 2 and 3, respectively. The pixel signals in the present frame are applied to the input terminal 4. The pixel signals one frame before and two frames before are applied to the input terminals 5 and 6, respectively. The pixel signals in", "input_text": " to an output signal from an EXOR circuit 39 to output a signal Sc in which the pixel signals included in the signal Sa are interpolated between the pixel signals one frame before in place of the pixel signals two frames before. A frame memory 26 for delaying the input signal Sc by approximately one frame period is provided in the interframe interpolation circuit 14. The frame memory 26 comprises field memories 27 and 28 each constituting one field delay circuit. This one-field memory 28 has its delay time controlled responsive to a motion vector signal in order to correct a motion vector.\nA motion detecting circuit 20' receives the respective signals in the present frame, one frame before and two frames before. As mentioned above, since the sampling points of the MUSE signal are circulated every two frames (four fields), the motion detecting circuit 20, detects a motion by comparing the pixel signals in the present frame and those two frames before (the detection of the difference in motion between every two frames). 
Since the motion detection is incomplete only by detecting the difference between every two frames, the detection circuit 20' also detects a motion by comparing the pixel signals in the present frame with those one frame before. This motion detection between any frames is carried out by comparing signal components equal to or less than 4.2 MHz, which have no folding distortion generated by sub-sampling in the still picture. The signals, which represent the amount of motion detected by these two motion detecting operations, are applied to the mixing circuit 21, and the mixing ratio is controlled as described above.\nThe clock signal of 16.2 MHz is applied to the EX-OR circuit 39 through an input terminal 29. A phase control signal for interpolating the pixel signals in the present frame in place of the pixel signals two frames before by the switch S1 is applied to the EX-OR circuit 39 through the other terminal 30. This phase control signal is generated in the synchronization/"} +{"output_text": " around a vertical axis, which is the most common solution, or by sliding the door in a vertical direction, which is the most common solution for hanging furniture.\nIn the first case, the door is hinged on one side of the piece of furniture, and the passage from one position to another takes place by rotation around a vertical axis.\nIn the second case, the door is hinged on the top of the piece of furniture, and the passage from one position to another takes place by", "input_text": " door is cantilevered, corresponding to the top of the piece of furniture or an overlying shelf. 
The invention further relates to a piece of furniture.\nIn the furniture industry, furniture is normally made of parallelepiped formats with, as is known, two vertical parallel sides, a top, a base and an optional rear wall, defining an internal compartment where objects, food, clothing and so on can be stored.\nThe front opening through which said internal compartment is accessed is selectively closed by means of one or more doors, movable between a closing position, in which access to the compartment is prevented, and an opening position, in which the interior contents can be accessed.\nThe passage between said closing position and said opening position can be carried out in different ways: in fact, the doors can be, for example, hinged on one side of the piece of furniture in such a way that the passage from one position to another essentially takes place by rotation around a vertical axis, thus a considerable space is required and the user's lateral movement is hindered.\nTo overcome this drawback, sliding doors have been proposed, in which the passage from one position to another takes place by lateral or vertical sliding; however, access to a part of the interior compartment of the piece of furniture is difficult or impossible.\nRecently, in particular, regarding hanging furniture for example for kitchens, but also living rooms and bathrooms, the market is aiming at opening/closing systems where the door in an opening position is \u201ccantilevered\u201d from the body of the piece of furniture corresponding to the cover of the piece of furniture or possibly on the top shelf in the case of furniture with more spaces; this solution is highly appreciated as it is more ergonomic and allows for easy lateral movement by users even when the door is in the opening position.\nIn particular, the opening position can be achieved by ensuring door rotation"} +{"output_text": " field, and by adding, deleting or changing the definition of a record; and (iii) the DBA may 
add, delete or change the definition of a field or a record in the data set.\nA \u201crecord\u201d is a collection of fields, each of which is a data item. A record is a unit of data that the system reads from or writes to a file in one execution cycle of a Read or Write statement in a program.\nA \u201cfield\u201d is a consecutive group", "input_text": " a record is called a row. Data items reside in different fields in the records. For example, a record might involve a series of data such as an employee's name, the employee's I.D., the employee's social security number and years of employment. A group of such records would constitute a file.\nThe operating system, which is used by the data management system, will treat the record as a unit. The system makes data available to users in records and not in individual single items of data. In programming languages, the record is the unit of data that the system reads from or writes to a file in one execution cycle of a Read or Write statement in a program.\nIf the application program wants to change a data item in a given record, the Data Management System brings a copy of the record from the physical storage over to memory, then enables that data item to be changed, and then writes the changed record back to the file.\nA \u201cfield\u201d is a consecutive group of bits or bytes within a particular component of a record, which will represent a logical piece of data. A field or column is defined by the description of the data item it is to hold. 
For example, if one field carries the name of an employee, this field in the record could be called the name field.\nThe \u201cdata set\u201d is a physical file, that is to say, a collection of related data records stored on a random-access storage device, such as a disk in which the data resides.\nA data set is kept up-to-date in several ways: (i) here, application programs add, change, or delete individual pieces of data or records stored in the data set; (ii) the Database Administrator (DBA) maintains the structure of the data set by keeping the data set within certain maximized limits, by adding, deleting or changing the definition of a"} +{"output_text": " mask system fits.\nU.S. Pat. No. 8,905,982 for \u201cSystem and method for custom-orienting a medical mask to an oral appliance\u201d by inventor Thornton, filed Nov. 29, 2007 and issued Nov. 20, 2011, is directed a medical mask including a body and an orientation structure. The body includes a first polymer, is configured to cover portions of a user's face comprising the user's mouth and at least portions of the user's nose", "input_text": " exposed to light through the clear areas of the mask.\nU.S. Pat. No. 8,020,276 for \u201cSystem and method for custom-orienting a medical mask to an oral appliance\u201d by inventor Thornton, filed Nov. 29, 2007 and issued Nov. 20, 2011, is directed a medical mask including a body and an orientation structure. The body includes a first polymer, is configured to cover portions of a user's face comprising the user's mouth and at least portions of the user's nose comprising the nostrils, and is further configured to contact the user's face surrounding the covered portions of the user's face to substantially prevent gas from escaping between the body and the contacted portions of the user's face. 
The orientation structure is configured to receive an oral appliance post to establish and maintain a custom orientation between the medical mask and the oral appliance post and the orientation structure includes a deformable material which includes a second polymer capable of transitioning between deformable and non-deformable states.\nU.S. Pat. No. 8,254,637 for \u201cMask fitting system and method\u201d by inventors Abourizk et al., filed Jul. 26, 2007 and issued Aug. 28, 2012, is directed a system and methods for selecting a mask system for a patient, where certain example embodiments include generating 3D contours of patients and selecting mask systems based at least on those contours. These contours may be generated by using, for example, a cushion of translatable pins, a nasal cannular scanning device, and/or a shadow stereopsis sensor. Certain other example embodiments allow images and/or videos to be captured and optionally synchronized. Then, images of various mask systems may be overlaid to determine how well a mask system fits. In still other embodiments, a user can hold a transparency corresponding to a mask design in front of the patient's face to determine how well a"} +{"output_text": "\nThe defibrillation shock is usually delivered via the large-area defibrillation electrode. The defibrillation shock is usually delivered in the form of a biphasic pulse, i.e., a first phase of a first polarity and a second phase of a second polarity. The first phase is usually a relatively strong current pulse of a first polarity, which is followed by a second phase of a second polarity, which is usually a relatively weak current pulse. The first phase is usually a relatively", "input_text": " vein branching off therefrom into proximity to the left ventricle and may have a small-area stimulation electrode and/or sensing electrode there.\nThe typical stimulation modes which are implementable with a heart stimulator may be assumed to be known (VVD, DDD, etc.) 
so they need not be explained further here.\nBeyond the properties of a heart pacemaker already described here, of delivering to the heart a stronger current pulse, which should not only stimulate (depolarize) a small portion of the myocardium but should depolarize the largest possible amount of myocardium and thus make it refractory to thereby interrupt the typical cycling stimulation of the myocardium that is typical of fibrillation. Such a pulse is known as a defibrillation shock. It is typically delivered via a large-area defibrillation electrode in comparison with the stimulation electrode or sensing electrode.\nThis is often implemented in the form of a shock coil on the outer surface of the electrode line in the respective chamber of the heart. For example, a ventricular electrode line in addition to a tip electrode or a ring electrode for stimulation and sensing may also have a ventricular shock coil as well as a proximal shock coil situated in the superior vena cava after implantation.\nA defibrillation shock is usually delivered when the heart stimulator detects a fibrillation, i.e., an irregular high-frequency intrinsic activity of the heart which does not lead to complete contraction of the respective chamber of the heart. Such a fibrillation is classified as a tachycardiac arrhythmia, which includes tachycardias in addition to fibrillations. In contrast with fibrillation, complete contraction of the respective chamber of the heart occurs regularly in tachycardia but at a higher rate than would be physiologically appropriate. Such tachycardias can often be treated by antitachycardiac stimulation and do not require a defibrillation shock. Fibrillations are usually treated with a defibrillation shock."} +{"output_text": " operating conditions.\nThe described invention is particularly useful in the field of electrical machines such as motors and generators. 
The invention is also useful in the field of electrical power transmission and distribution systems.\nThe invention is particularly useful in the field of electrical machines such as motors and generators. The invention is also useful in the field of electrical power transmission and distribution systems.\nThe invention is particularly useful in the field of electrical machines such as motors and generators. The invention is also useful in the field of electrical", "input_text": " interlaminar insulation resulting in conduction between laminations. (Type VII).\ne. Turn-to-turn insulation faults within a particular primary coil due to insulation failure or damage. (Type I).\nf. Open-circuit fault whereby conductors separate within a particular primary coil interrupting the flow of current. (Type IV).\ng. Open-circuit fault whereby conductors or coil groups separate in the machine end-region or terminal box area interrupting or altering a normal flow of phase current. (Type V).\nh. Partial discharge or electric dielectric failure of the primary coil insulation system resulting in an increased progressive degradation of the primary insulation system and not necessarily resulting in a short-circuit current condition. (Type VIII).\nThe other aspect of the described invention is the method and apparatus for segregating or isolating selected sections of primary phase windings or regrouping the primary winding to minimize the affects of a primary electrical fault on the overall operation of the motor or generator. The invention uses high speed electronic switching devices which are connected to either individual primary coils or to primary phase groups which are responsive to signals from the master control system. 
This system performs both a diagnostic function to determine the location and type of fault and then proceeds to determine a schedule of switching of primary coil members to allow for isolation of the electrical fault section(s) and to enhance a specific output of the electrical machine such as torque, voltage output or reactive power.\nA particularly novel aspect of the described invention is the ability to segregate primary coil groups in a fashion that results in the airgap magnetic flux being maintained in a symmetrical electromagnetic condition whereas otherwise the location and magnitude of the fault would cause a large asymmetry in airgap flux spatial distribution and magnetic core magnetic flux resulting in overheating or unacceptable magnetic overloads. The criteria for performing selective coil isolation is prescribed by the master control system and dependent on the machine's specific"} +{"output_text": " is not in range of the monitor. \nThe monitor will then send a signal to the remote to notify the parent that the child has entered the water.\nThe monitor is a radio receiver and transmitter. The monitor is powered by a battery. The monitor is a radio receiver and transmitter. The monitor is powered by a battery. The monitor is a radio receiver and transmitter. The monitor is powered by a battery. The monitor is a radio receiver and transmitter. The monitor is powered by a battery", "input_text": " in good working order.\nThe difference is that Curcio patent is for a local tracking, the remote and monitor must have radio with line-of-sight between the two. In the case of a lone-pilot or no one noticing someone falls overboard, the device will not work. Line of sight on a flat surface like a calm ocean is limited to a few miles if the monitor has an antenna on a very tall mast on a ship. 
However, if the seas have swells the line of site distance is very limited.\nIf the ship, which contains the monitoring device, should sink, this device is useless in informing authorities of the location of the survivors. Additionally, at sea, this device requires that each vessel or using the device must buy equipment for both the remote and monitor. The monitor equipment must be properly installed, powered, manned and maintained. If the monitors circuits, batteries, antenna should fail, the remote is useless. Full redundancy of monitoring equipment is needed to ensure a high reliability of service. This is a sufficient cost for each vessel to bear in terms of equipment, space and manpower. The device is also useless for lone-ship pilots, air-plane crews, or life rafts adrift where no monitor is available.\nOn land the tracking distance for this system will be very limited by ground obstructions and only good for short distances when the monitor equipment is available, manned, powered and maintained.\nBurks U.S. Pat. No. 6,317,050 Nov. 13, 2001 Water entry alarm system\nBurks water enter alarm system is a harness intent for children in a pool area. The remote (child with a water entry harness on) will trigger the alarm on the monitor radio if the remote does one of the following; enters the water, or the belts are unhooked, or the radio"} +{"output_text": " the loop (I). The power loss (P) is given by:P=XL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0fL2I=2\u03c0", "input_text": "1. A node is formed between switches Q1 and Q2 to which one end of an inductor 206 is connected. The other end of inductor 206 is connected to the boost circuit of buck-boost DC-to-DC converter 22 at a second connecting two switches: a high side boost switch Q4 and a low side boost switch Q3 together in series where the source of Q4 connects to the drain of Q3 to form node B. 
The drain of Q4 and the source of Q3 connect across an output capacitor C2 to produce the output voltage Vout of buck-boost DC-to-DC converter 22.\nAt higher switching frequencies of switched inverters/converters, lower values of reactive components can be used in circuit to achieve the required output characteristics of the inverters/converters. However, the increase in frequency can have the undesirable effect of increasing electromagnetic interference (EMI) if good circuit design and good circuit layout practices are not followed. Remembering that currents flowing in a closed path, i.e. a loop (formed by circuit board traces) acts as an efficient radiator of electromagnetic energy, maximum radiation efficiency occurs when the loop dimension is on the order of one-half wavelength. To minimize the radiation efficiency, that is to reduce radiated noise, the loop is made as physically small as possible by being aware of parasitic inductances in the board traces. High-frequency currents follow the path of least impedance (and not the path of least resistance) and a way to reduce the inductive impedance (XL=2\u03c0fL) of parasitic inductances (L) is to reduce the frequency (f) or to reduce the size of the loop, since a longer loop gives more parasitic inductance (L). Power loss (P) in the loop is the product of the inductive impedance (XL) squared and the high frequency current in"} +{"output_text": ") mouth. This distance is referred to as the xe2x80x9cnear-endxe2x80x9d to the xe2x80x9cfar-endxe2x80x9d. The near-end and far-end are often referred to as the xe2x80x9cnear-endxe2x80x9d and xe2x80x9cfar-endxe2x80x9d, respectively.\n", "input_text": " directories may be hard to locate or decipher, especially for non-English speakers or for persons with little or no time, again such as emergency personnel. Consequently, some buildings provide color stripes along walls that serve as color coding to guide visitors to various areas within the building. 
Unfortunately, the number of color stripes that may be patterned is quite limited, and the expense and defacing of appearance associated therewith is undesirable. Furthermore, such striping does not completely alleviate confusion, and the color stripes can only serve as general guides to commonly visited areas.\nThe art referred to and/or described above is not intended to constitute an admission that any patent, publication or other information referred to herein is \u201cprior art\u201d with respect to this invention. In addition, this section should not be construed to mean that a search has been made or that no other pertinent information as defined in 37 C.F.R. \u00a71.56(a) exists.\nAll U.S. patents and applications and all other published documents mentioned anywhere in this application are incorporated herein by reference in their entirety.\nWithout limiting the scope of the invention, a brief summary of some of the claimed embodiments of the invention is set forth below. Additional details of the summarized embodiments of the invention and/or additional embodiments of the invention may be found in the Detailed Description of the Invention below.\nA brief abstract of the technical disclosure in the specification is provided for the purposes of complying with 37 C.F.R. \u00a71.72. The present invention relates to communications systems, and more particularly, to methods and apparatus for mitigating the effects of disruptive background noise components in communications signals.\nToday, technology and consumer demand have produced mobile telephones of diminishing size. 
As the mobile telephones are produced smaller and smaller, the placement of the microphone during use ends up more and more distant from the speaker\"\"s (near-end"} +{"output_text": " driving pixels are manufactured from a single substrate.\nIn the case where a plurality of semiconductor devices are manufactured from a single large area substrate, a plurality of semiconductor devices are formed on the substrate. Then, the substrate is divided into a plurality of areas. Then, laser annealing is performed for each area. Thus, a crystalline semiconductor film can be formed in each area.\nHowever, in the case where laser annealing is performed for each area, a laser light is irradiated with an overlap state while being", "input_text": " laser lights are temporarily synthesized by a cylindrical lens 104. After that, the laser lights are reflected by a mirror 107 and then condensed by a doublet cylindrical lens 108 so that they become again single laser light on the surface to be irradiated 109. The doublet cylindrical lens 108 is a lens composed of two cylindrical lenses. Thus, an energy density distribution of the linear laser light in a width direction is homogenized.\nFor example, an excimer laser in which a size in a laser window is 10 mm\u00d730 mm (which each are a half-width in beam profile) is used as the laser 101 and laser light is produced by the optical system having the configuration shown in FIGS. 7A and 7B. Then, linear laser light which has a uniform energy density distribution and a size of 125 mm\u00d70.4 mm can be obtained on the surface to be irradiated 109.\nAt this time, when, for example, quartz is used for all base materials of the optical system, high transmittance is obtained. 
Note that coating is preferably conducted for the optical system such that transmittance of 99% or more is obtained at a frequency of the used excimer laser.\nThen, the linear laser light formed by the above configuration is irradiated with an overlap state while being gradually shifted in a width direction thereof. Thus, when laser annealing is performed for the entire surface of an amorphous semiconductor film, the amorphous semiconductor film can be crystallized, crystallinity can be improved to obtain a crystalline semiconductor film, or an impurity element can be activated.\nAlso, an area of a substrate used for manufacturing a semiconductor device is being increased more and more. This is because high throughput and a low cost can be realized in the case where a plurality of semiconductor devices such as liquid crystal display device panels are manufactured from a single large area substrate as compared with, for example, the case where TFTs for"} +{"output_text": " and the space available for the attachment is limited.\nIt is a further object of the present invention to provide an attachment for a partial denture which may be used with a variety of abutment teeth and which may be used with a variety of abutment teeth and which may be used with a variety of abutment teeth and which may be used with a variety of abutment teeth and which may be used with a variety of abutment teeth and which may be used with a", "input_text": " joined to the abutment and a vertically oriented male member on the support of the mounted partial denture. It has a disadvantage in that everytime it is removed and reattached there is a great mechanical shock to the abutment tooth. Furthermore, the connector and protruder of this attachment can become worn due to abrasion and their ability to support the denture is thereby weakened. This type of attachment has another disadvantage in that it may not be used if the space available on the abutment tooth is less than 4 millimeters. 
Finally the cost of using the Ceka type attachment is high because it may only be used with expensive porcelain fused metal crowns.\nIn the Inoue type attachment a small protruder is mounted on the crown of an abutment tooth and a sleeve and lock pin are formed at the base portion of the partial denture. The lock pin is inserted in the protruder to fix the partial denture. In comparison with the other mentioned methods of attachment this method is comparatively safe but still has significant shortcomings. The sleeve and lock pin are difficult to operate by hand. Great damage can be done to the gingiva during attachment and it may only be used when a unilateral tooth is missing. Finally, this attachment is rather unsightly due to the external exposure of the lock pin.\nIt is therefore an object of the present invention to provide a removable attachment which does not require fabrication to narrow tolerances or from expensive materials and which may be economically replaced if refitting is necessary.\nIt is a further object of the present invention to provide an attachment for a partial denture which will not damage the abutting teeth or the gingiva.\nIs is another object of the present invention to provide a removable attachment for a partial denture which may be used in a variety of applications including cases where the abutment teeth are few in number"} +{"output_text": " method of fabricating a membrane probe that overcomes the problems of the prior art. What is also desired is a method of fabricating a membrane probe that provides a more uniform spacing of devices on the membrane probe. What is also desired is a method of fabricating a membrane probe that provides a more uniform spacing of devices on the membrane probe. What is also desired is a method of fabricating a membrane probe that provides a more uniform spacing of devices on the membrane probe. What is also desired is", "input_text": " density of the beams 90. 
Accordingly, regions of the membrane probe that have a denser spacing of devices, the photoresist layer 79 (and 81) will be thicker on average than regions of the membrane probe that have a less dense spacing of devices. During the exposing and etching processing of the photoresist layer 79 (and 81), the duration of the process depends on the thickness of the photoresist 79 (or 81). With variable photoresist thickness it is difficult to properly process the photoresist to provide uniform openings. Moreover, the thinner regions of photoresist layer 79 (or 81) will tend to be overexposed resulting in variably sized openings. Also, the greater the photoresist layer thickness 79 (or 81) the greater the variability in its thickness. Accordingly, the use of photoresist presents many processing problems.\nFifth, separate alignment processes are necessary to align the beams 90 on the traces 76a, the contact bumps 92 on the beams 90, and the contacting portions 93 on the contact bumps 92. Each alignment process has inherent variations that must be accounted for in sizing each part. The minimum size of the contacting portions 93 is defined primarily by the lateral strength requirements and the maximum allowable current density therein. The minimum size of the contacting portions 93, accounting for the tolerances in alignment, in turn defines the minimum size of the contact bumps 92 so that the contacting portions 93 are definitely constructed on the contact bumps 92. The minimum size of the contact bumps 92, in view of the contacting portions 93 and accounting for the tolerances in alignment, defines the minimum size of the beams 90 so that the contact bumps 92 are definitely constructed on the beams 90. 
Accordingly, the summation of the tolerances of the contact bumps 92 and the contacting portions 93, together with a minimum size of the contacting portions 93, defines the minimum device size, and thus defines the minimum pitch between contact pads.\nWhat is desired, therefore, is a"} +{"output_text": " 10 is comprised of an oxide film.\nA contact hole is then formed in the first insulating film 10, as shown in FIG. 2C. The contact hole is formed by etching the first insulating film 10 using the insulating film spacers 7 as a mask.\nA conductive layer is then formed on the first insulating film 10, as shown in FIG. 2D. The conductive layer is comprised of a polysilicon film. The conductive layer is formed by depositing a polysilicon film", "input_text": " order to solve such problems, there have been proposed techniques of provision of a contact plug formed by burying the conduction layer in a lower portion of the contact hole and formation of a contact pad being in contact with the semiconductor substrate.\nThese conventional techniques will now be described in conjunction with FIG. 1 and FIGS. 2A to 2F.\nFIG. 1 shows a layout of a MOSFET with a conventional contact structure. In FIG. 1, the MOSFET structure is shown in symmetry to its drain. As shown in FIG. 1, the MOSFET includes an active mask 50, a word line mask 52, a source/drain contact mask 54, a drain contact mask 56, a first conduction wiring mask 58 and a source contact mask 60.\nFIGS. 2A to 2F are cross-sectional views respectively taken along the line X-X' of FIG. 1, showing a conventional method for forming contact plugs respectively on a source and a drain of the MOSFET structure shown in FIG. 1 and forming a conduction layer to come into contact with the contact plugs.\nIn accordance with this method, first, an insulating film 2 for an element isolation is formed on a predetermined portion of a semiconductor substrate 1, as shown in FIG. 2A. 
On the other portion of the semiconductor substrate 1, a MOSFET is then formed to include a gate oxide film 3, gate electrodes 4, a source 6 and a drain 6'. Thereafter, an insulating film 5 and insulating film spacers 7 are formed on the upper surface and side surfaces of each gate electrode 4, respectively. The insulating film 5 is comprised of an oxide film.\nA thin oxide film 8 is then formed on the exposed source 6 and drain 6', as shown in FIG. 2B. Over the entire exposed surface of the resulting structure, a first insulating film 10 for planarization is formed. The first insulating film"} +{"output_text": " data rate that can be achieved because of the quantization of the signal.\n2. Modem\nA modem is a device that converts digital data into analog data and vice versa. Modems are used to connect a computer to a telephone line. The modem converts digital data into analog data and vice versa. The modem also converts digital data into a modulated carrier and transmits the modulated carrier over the telephone line. The modem then converts the modulated carrier back into digital data and transmits the digital data to the computer", "input_text": " orthogonal cosine and sine components (also referred to as inphase (I) and quadrature (Q) channels), a modulated carrier may be thought of as the sum of a modulated sine wave and a modulated cosine wave.\nAs is well known in the art, a two-dimensional plane, or I-Q plane, is used as a shorthand notation to represent the amplitude and phase of the carrier. The signals that make up a signal constellation are represented as points in the I-Q plane, which are usually set out in a grid-like fashion. A particular signal point may be specified as a coordinate pair in the I-Q plane. The points in the I-Q plane arc also generally referred to as a baseband representation of the signal because the points represent the amplitudes by which the sine and cosine components of a carrier will be modified. 
Each \"signal point\" is also referred to herein as a \"symbol.\"\nWhile the invention described herein is applicable to systems that use modulated carriers as described above, the preferred embodiments are essentially baseband systems that do not involve the modulation of a carrier. Consequently, the signal points are selected from a single-dimensional signal space, as opposed to a two-dimensional inphase/quadrature signal space. The system for which the invention is particularly well suited uses the public digital telephone network.\n1. Digital Telephone Network\nFor many years the public digital telephone network (DTN) has been used for data transmission between modems. Typically, a modulated carrier is sent over a local loop to a service provider (e.g., a Regional Bell Operating Company), whereupon the service provider quantizes the signal for transmission through the DTN. A service provider that is located near the receiving location converts the digital signal back to an analog signal for transmission over a local loop to the receiving modem. This system is limited in the maximum"} +{"output_text": " 5 shows a voice production model. In the process of voice production, the sound source signal produced by the sound source (vocal chords) 110 is input into a sound adjustment system (vocal tract) 111, and vocal tract characteristics are added in this vocal tract 111. Subsequently, the voice is finally output as a voice waveform from the lips 112 (see \u201cOnsei no Konoritsu Fugoka\u201d [\u201cHigh Efficiency Encoding of Voice\u201d], pp. 
69\u2013", "input_text": " increased, the burden on the auditory sense of the listener (user) is increased, which is undesirable from a health standpoint.\nFurthermore, in conventional methods using a high-band enhancement filter, if simple high-band enhancement is used, high bands of noise other than the voice are enhanced, so that the feeling of noise is increased, which does not always lead to an improvement in clarity.\nMoreover, in conventional methods using a band splitting filter, there is no guarantee that the voice formants will always fall within the split frequency bands. Accordingly, there may be cases in which components other than the formants are enhanced, so that the clarity conversely deteriorates. Furthermore, since the input voice is amplified without separating the sound source characteristics and the vocal tract characteristics, the problem of severe distortion of the sound source characteristics arises.\nFIG. 4 shows a voice production model. In the process of voice production, the sound source signal produced by the sound source (vocal chords) 110 is input into a sound adjustment system (vocal tract) 111, and vocal tract characteristics are added in this vocal tract 111. Subsequently, the voice is finally output as a voice waveform from the lips 112 (see \u201cOnsei no Konoritsu Fugoka\u201d [\u201cHigh Efficiency Encoding of Voice\u201d], pp. 69\u201371, by Toshio Nakada, Morikita Shuppan).\nHere, the sound source characteristics and vocal tract characteristics are completely different characteristics; however, in the case of the abovementioned conventional technique using a band splitting filter, the voice is directly amplified without splitting the voice into sound source characteristics and vocal tract characteristics. Accordingly, the following problem arises: namely, the distortion of the sound source characteristics is great, so that the feeling of noise is increased, and the clarity deteriorates. An example is shown in FIGS. 5 and 6. 
FIG."} +{"output_text": " shown in FIG. 32, the image forming section is required to have a memory for storing the read data of each image forming means. However, when the image forming means is a color image forming means, the memory for storing the read data of each color is required to have a memory for storing the read data of each color. Therefore, the image forming apparatus is required to have a plurality of memories for storing the read data of each color, resulting in an increase in the cost of the image forming", "input_text": " the respective corresponding photosensitive drums, are read by a CCD sensor or the like, the read pattern images are stored in a memory, and the positions of the registration correction patterns for each color are determined on the basis of the density values of the read data in accordance with pattern image data read out sequentially from the memory. In such a case, when the transferred registration correction pattern image cannot be formed clearly due to changes in the environment, or changes in the process conditions such as latent imaging, development or transfer of images, or when an image is formed on scratches or contaminants on the transfer belt, the central position of the registration correction image is erroneously computed on the basis of the read data. As a consequence, an error occurs in the computation of registration deviations of each color, causing the registration to deviate.\nFor example, when an image is formed normally on the transfer belt, the histogram data of the density additions regarding a pattern image in the main-scanning and sub-scanning directions is distributed as shown in FIG. 8. The position of the maximum value of the histogram data matches the central position of the histogram, making it possible to easily compute the central value. However, when the transfer conditions vary as shown in FIG. 
30, for example, when data is lost during transfer, causing the density in the central portion to be higher than that in the edge portion of the image pattern, or when a scratch is present on the transfer belt as shown in FIG. 31, the maximum value of the histogram does not match the central value of the image pattern. Therefore, the central position of the registration correction image is erroneously computed on the basis of the read data.\nIn addition, when data written in a memory in block units is read out in block units by the image processing section of the above-described image forming apparatus, and when the image forming section has a plurality of image forming means as"} +{"output_text": ", ELECTROTHERMAL PRINTING UNIT, issued July 6, 1976.\nIn the above-mentioned patents, the heads are arranged in a row and are moved relative to the paper by a pair of spaced-apart drive rollers. The drive rollers are rotated in opposite directions to effect printing of characters or other indicia in dot matrix fashion. The heads are heated by energizing a resistive element in each head to a temperature high enough to color the paper to the desired", "input_text": " opportunity of wood products through improved performance of lower grade lumber and veneer products.\nThe objectives of this invention and associated benefits will become apparent from the following detailed description and accompanying drawings. 1. Field of the Invention\nThe present invention relates to electrothermal printing apparatus in which printing is effected by momentarily heating selected portions of a heat sensitive medium, and more particularly to arrangements in which characters and other indicia are printed on thermally sensitive paper by imparting heat to the paper via an array of heads or other energizable elements movable relative to the paper.\n2. 
History of the Prior Art\nElectrothermal printing apparatus in which one or more heads or other elements are momentarily heated to heat selected areas of an adjacent thermally sensitive paper or other thermally sensitive medium which discolors in response to the heat to effect printing is well known in the art. In typical arrangements of this type a row of side-by-side heads is often provided for sweeping movement relative to the thermally sensitive paper to effect printing of characters or other indicia in dot matrix fashion. The individual heads typically consist of small resistive elements which must be heated to a temperature high enough to color the paper to the desired degree of resolution. At the same time heating of the head must be done relatively quickly so that only a discrete localized area of the paper is colored as the paper continues to move relative to the heads. Examples of this type of printing apparatus are provided by U.S. Pat. No. 3,951,247 to Montanari, ELECTROTHERMAL PRINTING UNIT, issued Apr. 20, 1976, U.S. Pat. No. 3,989,131 to Knirsch et al, ELECTROTHERMAL PRINTING UNIT, issued Nov. 2, 1976, and U.S. Pat. No. 3,967,092 of Conta"} +{"output_text": " biomolecules. In MALDI, a sample is mixed with a matrix material, which is a solid at room temperature. The sample and matrix are then deposited on a sample plate, which is typically a metal plate. The sample plate is then placed in a vacuum chamber, which is maintained at a pressure of about 10\u22123 Torr. The sample plate is then irradiated with a laser beam, which is typically a pulsed nitrogen laser beam. 
The laser beam is focused on the sample plate, and the", "input_text": " retransmitted packets are successfully received, the subscriber computers are able to successfully reconstruct the file.\nHowever, this approach becomes increasingly inefficient for large numbers of subscriber computers, because each subscriber computer generally loses packets different from those lost by the other subscriber computers, thus causing the number of retransmitted packets to increase dramatically. In fact, the number of retransmission packets may begin to approach the entire number of originally transmitted packets, thus causing this approach to become as inefficient as retransmission without using back channels (in which the entire file is retransmitted). Furthermore, this approach has no way to account for the possible loss of some of the retransmitted packets on retransmission.\nIt would thus be desirable to provide a method that overcomes the above-described problems, especially in the case of a large number of subscriber computers communicating with a host computer. In particular, a method is provided that reduces the total number of retransmitted packets. Another method is provided that accounts for the likelihood that some of the retransmitted packets may be lost on retransmission, and thus increases the number of retransmitted packets accordingly. Mass spectrometry is a well-established analytical technique in which sample molecules are ionized and the resulting ions are sorted by mass-to-charge ratio. Advances in mass spectrometry have made it possible to obtain detailed information regarding a wide variety of sample surface types. In the semiconductor industry, for example, secondary ion mass spectrometry is used to determine the composition of microscopic regions of wafer surfaces. As another example, in the biotechnology arena, surface-based mass spectrometry is used to analyze single nucleotide polymorphisms in microarray formats. See, e.g., U.S. Pat. No. 
6,322,970 to Little et al.\nMatrix-Assisted Laser Desorption Ionization (MALDI) is an ionization technique commonly used for mass spectrometric analysis of large"} +{"output_text": " steering stability can be obtained.\nHowever, in the pneumatic tire described in Japanese Unexamined Patent Application Publication No. 2011-230699A, the tread surface is formed by spirally winding a reinforcing cord made of an organic fiber cord in the tire circumferential direction on the outer side of the belt layer in the tire radial direction. Therefore, the reinforcing cord is exposed on the tread surface, and the reinforcing cord is likely to be damaged by stones or the like.\nIn addition, the", "input_text": " regular rim, inflated to a regular internal pressure, and having no load applied, when viewed in the tire meridian cross section including the tire axis, the radius of curvature R1 of the outer surface of the center land portion is greater than the radius of curvature R2 of the outer surface of the intermediate land portion, and the center of each radius of curvature R1, R2 lies is at the same position.\nIn addition, Japanese Unexamined Patent Application Publication No. 2011-230699A, for example, describes a pneumatic tire comprising at least one carcass and belt layer, a belt cover layer formed by spirally winding a reinforcing cord made of an organic fiber cord in the tire circumferential direction on the outer side of the belt layer in the tire radial direction, and a tread portion comprising a tread surface different in the groove surface area ratio on either side of the tire equatorial plane. 
The tread surface of such a pneumatic tire has a tread profile in which, when viewed in the tire median cross section, the outermost position in the tire radial direction is located on the side of the tire equatorial plane with the smaller groove surface area ratio, and shoulder drop amounts in the tire radial direction from the outermost position in the tire radial direction at both ends in the tire width direction are equal.\nThe pneumatic tire described in Japanese Unexamined Patent Application Publication No. 2011-230699A is provided with a tread portion that comprises a tread surface in which the groove surface area ratio is different on either side of the tire equatorial plane. Water drainage performance is obtained from the side with the greater groove surface area ratio (that is, the side with more grooves), and steering stability on dry road surfaces is obtained from the side with the smaller groove surface area ratio (that is, the side with less grooves and thus greater rigidity). As a result, both water drainage performance and"} +{"output_text": "E)-isomer, the corresponding trans-alkenyl aryl carbamate ester 38 is first deprotected under standard conditions to give the corresponding trans-alkenyl aryl carbamate acid analog IZf. For the trans- or (Z)-isomer, the corresponding trans-alkenyl aryl carbamate ester 38 is first deprotected under standard conditions to give the corresponding trans-alkenyl aryl carbamate acid analog IZg. \nThe corresponding trans-alkenyl aryl", "input_text": " analogs IZd (Scheme 28). Alternatively, this sequence can be reversed, i.e. the initial step being the deprotection of acetylenic ester 36 to the acetylenic acid, followed by stereoselective reduction of the acetylene moiety to provide the Z-alkene-acid analogs IZd.\nThe corresponding trans-alkenyl aryl carbamate acids IZe can be synthesized according to the general route in Scheme 29. 
An aryl- or heteroaryl-acetylene 35 (the preferred moiety again being 5-phenyl-2-methyl-oxazol-4-yl-methylacetylene) is halogenated under standard conditions (ref: Boden, C. D. J. et al., J. Chem. Soc. Perkin Trans. I, 1996, 2417; or Lu, W. et. al., Tetrahedron Lett. 1998, 39, 9521) to give the corresponding halo-acetylene, which is then converted to the corresponding trans-alkenyl stannane 37 (ref: Boden, C. D. J., J. Chem. Soc., Perkin Trans. I, 1996, 2417). This aryl- or heteroaryl-substituted trans-alkenyl stannane 37 is then coupled with the halo-aryl carbamate ester 34 under standard Stille coupling conditions (ref: Farina, V. et. al., \u201cThe Stille Reaction\u201d, Organic Reactions, 1997, 50, 1) to furnish the corresponding trans-alkenyl aryl carbamate ester 38. This carbamate-ester is then deprotected under standard conditions to give the desired trans-alkenyl aryl carbamate acid analogs IZe.\nThe corresponding cyclopropyl analogs IZf and IZg are synthesized according to Scheme 30. For the cis- or ("} +{"output_text": " are relevant for the merge operation, then the two objects ai and bj are considered to be identical. If two objects ai\u03b5A and bj\u03b5B are identified as identical by mapAB (e.g., (ai,bj)\u03b5mapAB), and if no other entries involving ai or bj exist in mapAB a (\u2203ax s.t. (ax,bj)\u03b5mapAB\u2203by s.t. (ai,by)\u03b5mapAB", "input_text": " to work directly with these representations. Second, even small changes of a model can lead to significant changes of its textual representation, making it hard to differentiate between the actual changes on the model level and purely \u201csyntactical\u201d changes implicated by the textual representation. 
Text-based tools therefore are not really appropriate for model merging.\nWhen designing a merge system for merging two models A and B, a function may be provided for identifying pairs of elements ai and bj from A and B, respectively, that are considered identical (or at least elements that after a successful merge operation should appear only once in the resulting merged model C). \u201cIdentical\u201d element pairs are discussed herein as a mapping relation mapAB: A\u00d7B.\nIt will be appreciated that mapAB, being a relation, need neither be injective nor surjective nor a function. In general, model elements from A need not have counterparts in B, and vice versa, and an element ai from A could possibly have several \u201cidentical\u201d elements bi1,..., Bin in B and vice versa. In literature, techniques for producing such a mapAB from two models A and B are called schema or model matching techniques. In other scenarios, such a mapAB can also result from the process that created models A and B.\nBased on the content of mapAB, it is possible to distinguish different categories of (pairs of, groups of, individual, etc.) objects from A and B: If two objects ai\u03b5A and bj\u03b5B are identified as identical by mapAB (e.g., (ai,bj)\u03b5mapAB), and if no other entries involving ai or bj exist in mapAB a (\u2203ax s.t. (ax,bj)\u03b5mapAB\u2203by s.t. (ai,by)\u03b5mapAB) and if ai and bj agree on all properties that"} +{"output_text": " of edges that form a loop to a node from which the loop originates.\nA directed graph may be represented by a directed graph data structure. A directed graph data structure may be represented by a directed graph data structure data structure. A directed graph data structure may be represented by a directed graph data structure data structure. A directed graph data structure may be represented by a directed graph data structure data structure. 
A directed graph data structure may be represented by a directed graph data structure data structure. A directed graph", "input_text": " held stopped with the carriage positioned in a region within which normal recording is not performed (such position of the carriage will be hereinafter referred to as home position), the continuous jet type ink jet recording apparatus is rendered operative to perform test printing in which an ink jet is jetted and deflected a little with respect to a middle point between two paths along which the ink jet follows when the deflecting electric field is operative and inoperative and the charge (electric current) carried by the ink jet having passed by the knife edge is detected to detect the position of the ink jet jetting axis (nozzle axis).\nAnother continuous jet type ink jet recording apparatus has also been proposed by the inventor of the present patent application and is disclosed in applicant's co-pending U.S. patent application Ser. No. 07/784,719 filed Oct. 30, 1991 and in applicant's Japanese Patent Laid-Open Application No. 173151/1992 wherein the continuous jet type ink jet recording apparatus is rendered operative to perform test printing wherein an ink jet is jetted and is subject to deflection which varies continuously or stepwise and the charge (electric current) carried by the ink jet having passed by the knife edge is detected to detect the position of an ink jet jetting axis (nozzle axis). Various applications, including but not limited to the analysis of software programs, benefit from the creation of directed graphs, and more specifically, directed acyclic graphs to represent flow concepts as appropriate to the application. A directed graph may consist of nodes and edges. An edge may connect one node to another, with a direction from one node to the other. Edges may be represented by arrows to indicate the direction. Two edges may be contiguous if one flows into a node and the other flows out of the same node. 
Directed graphs may have edges that \u201cloop backwards\u201d; that is, it is possible to follow a set"} +{"output_text": " a method for manufacturing a semiconductor device, and more particularly to a method for manufacturing a semiconductor device having a trench isolation structure.\n(2) Description of the Related Art\nIn recent years, a semiconductor device having a trench isolation structure has been developed. The trench isolation structure is formed by forming a trench in a semiconductor substrate and filling the trench with an insulating film. The trench isolation structure is advantageous in that it can reduce the area of a semiconductor substrate occupied by a device isolation region, and can", "input_text": " solution of cadmium and tin bromides in the presence of H.sub.2 O.sub.2 and O.sub.2 is sprayed onto hot substrates at 400.degree. to 1000.degree. C.\nThe available heat flux from solar heat collectors, consisting of black body absorbers and greenhouse windows or of selective absorbers and greenhouse windows, is a function of solar concentration and absorber temperature. For high solar concentrations (>10) and absorber temperatures below 500.degree. C., it has been shown in Technical Report AFML-TR-70-294 that a greenhouse window does not add to the available heat flux. However, for lower solar concentrations, the addition of a greenhouse window contributes significantly to the available heat flux. Obviously, these investigations show that the highest heat flux contribution of a greenhouse window coating can be realized in flat-plate collectors and in pipe collectors which work at high operating temperatures.\nUnfortunately, the solar heat collectors as presently known do not provide for highly efficient heat collection. 
This lack of efficiency relates primarily to the re-radiating of substantial amounts of infrared thermal energy by the heat collecting surface.\nIt is an object of this invention to provide a solar heat collector of improved efficiency.\nIt is a further object of this invention to provide a solar heat collector having a greenhouse window which increases the heat flux available to the heat collecting system of a solar energy converter.\nOther objects and attendant advantages of the present invention will be apparent from the description thereof taken in connection with the accompanying drawings. The invention is capable of a variety of mechanical expressions, two of which are illustrated in the accompanying drawings. Therefore, it is to be expressly understood that the drawings are for the purpose of illustration only, and are not intended to present the full scope of the invention which is defined by the appended claims. (1) Field of the Invention\nThe present invention relates to"} +{"output_text": " part 71 for controlling the operation of the thermal developing part 47 and the cooling part 61. The controlling part 71 is constituted by a CPU (Central Processing Unit) and a memory, and controls the operation of the thermal developing part 47 and the cooling part 61.\nThe thermal developing part 47 and the cooling part 61 are controlled by the controlling part 71 in such a manner that the recording material A is thermally developed in the thermal developing part 47 and is cooled in the cooling part 61.\nThe thermal", "input_text": ", e.g., nichrome wire, a light source, e.g., a halogen lamp, and means for heating with hot air.\nThe press roller 53 may be a metallic roller, a resin roller, a rubber roller or the like, and is disposed over the entire length of the drum 51 in the axial direction thereof. 
The heater H used as a heating source of the press roller 53 is also not particularly limited, and may be one using a known heating unit, such as a heating element, e.g., nichrome wire.\nIn the thermal developing part 47, upon transporting the recording material A to the conveying path C, the first surface is pressed with the press rollers 53 to press the second surface to the drum 51, whereby both the first and second surfaces of the recording material A are simultaneously heated. According to the operation, both the surfaces of the recording material A can be heated uniformly in a short period of time. In this constitution, the drum 51 and the press rollers 53 are rotated as being synchronized with the conveying speed of the recording material A, whereby there is no deviation in relative position of the heating unit and the recording material A, and the recording material A is not scraped.\nThe recording material A having been developed in the thermal developing part 47 is fed to a cooling part 61 disposed at a downstream side of the conveying direction. The cooling part 61 is constituted by plural cooling rollers 63 and has such a function that the recording material A having been thermally developed is gradually cooled, and therefore, the cooling part 61 is set at such a temperature that is higher than non-heated members but is lower than the thermal developing temperature. The recording material A thus slowly cooled in the cooling part 61 is transported in the downstream side of the conveying direction with a pair of delivering rollers 65 and 67 and delivered to a tray 69.\nThe thermal developing apparatus 100 has a controlling"} +{"output_text": " tape\" and the wafer is then diced into individual chips. U.S. Pat. No. 5,288,663 teaches the use of a frame to hold the wafer in place during dicing. U.S. Pat. No. 5,169,804 teaches the use of a frame to hold the wafer in place during dicing. U.S. Pat. No. 
5,654,204 teaches the use of a frame to hold the wafer in place during dicing. U", "input_text": " filed Jul. 31, 1996, entitled Fixtures And Methods Of Lead Bonding and Deformation. In certain preferred embodiments taught in the '532 application, the sheet may be stretched by initially attaching it to a ring formed from a material of relatively high coefficient of thermal expansion such as aluminum at a low temperature such as room temperature, then heating the sheet and high-expansion ring and then attaching the sheet to a lower expansion ring such as a molybdenum ring. As disclosed, for example, in said U.S. Pat. No. 5,798,286 and in the corresponding PCT International Publication WO 97/11486, the disclosure of which is also hereby incorporated by reference herein, a frame-stretched sheet can be used in other assembly processes using individual semiconductor chips mounted individually to the sheet or mounted on a platen in a preselected array and bonded to the sheet as a unit.\nFramed sheets have also been employed in unrelated arts and for different purposes. For example, thin framed sheets referred to as pellicles are used in the optical arts as optical beam splitters as shown, for example, in Edmund Scientific, 1997 Optics and Optical Instruments Catalog, p. 56. U.S. Pat. No. 4,037,111 discloses the use of a mechanically stretched sheet held taut by a borosilicate glass frame as a mask for X-ray lithography. German Offenlegungsschrift DE-3,919,564 A1 discloses fabrication of printed circuits by silk-screening onto a polyimide film held taut by an aluminum frame.\nU.S. Pat. Nos. 3,537,169; 5,288,663; 5,169,804; 5,654,204; 3,562,058 and 5,362,681 teach processes in which a wafer is adhered to a plastic film or \"dicing"} +{"output_text": " travel is advantageous. 
This is because the oscillating lever is then able to rotate around the longitudinal axis of the connecting pin, and the oscillating lever is able to perform a rotational movement crosswise to the direction of travel.\nIn an embodiment, the oscillating lever is designed so as to be able to rotate around the longitudinal axis of the connecting pin. This is advantageous because the oscillating lever is able to perform a rotational movement crosswise to the direction of travel.\nIn an embodiment, the oscillating lever is", "input_text": " lever and the mounting, in particular the steering stub mounting, the torsion resistance of the connecting pin can be varied in accordance with the length of reinforcement, and it can be calculated in advance. Since the ratio between flexural stiffness and torsion resistance of a profile rod is variable in many areas with the design of the profile cross-section, the wall thickness, dimensions and/or profile design, it is possible to link the individual compressions of an axle with each other. By increasing the torsion resistance of the connecting pin accordingly, there is stronger coupling of the compression, so that the rolling angle of the vehicle, i.e. rotational movement around the longitudinal axis, can be prevented specifically. This is imperative especially in superstructures that create an elevated vehicle center of gravity.\nEmbodiments including the oscillating lever being connected rigidly to the connecting pin; or including the oscillating lever being designed so as to have low torsion resistance along its longitudinal axis; or including the oscillating lever having a thin-walled, flat shape are also advantageous. A variable compression invariably causes torsional flexing of the connecting pin in relation to the vehicle superstructure. 
In this case, twisting is forced upon the oscillating levers, which are connected rigidly to the connecting pin on the one hand, and can exclusively perform a rotational movement crosswise to the direction of travel. Therefore, the oscillating levers should advantageously be designed so as to have low torsion resistance along the longitudinal axis, in order to be able to achieve the necessary axle compression. This may be achieved with a thin-walled, flat design of the oscillating levers. Lateral forces such as those acting on the vehicle with every change in direction would cause an inadmissible deformation of the oscillating levers crosswise to the direction of travel.\nThereby, an embodiment in which the rotation point of the oscillating lever lies on the front connecting pin facing in the direction of"} +{"output_text": " methods are based on the measurement of the electric current induced by the movement of the liquid in the channel. The electric current is proportional to the flow rate of the liquid.\nThe electric current is generally measured by means of electrodes placed at the free end of the channel. The electrodes are generally placed in a liquid, which is generally a conductive liquid, such as water.\nThe electric current is generally measured by means of electrodes placed at the free end of the channel. The electrodes are generally placed in", "input_text": " difficulty and these methods are sensitive to effects linked to high throughputs. 
Mention will be made, among others, of the liquid meniscus being disturbed by the movement of the receptacle or the violence of the dispensing, the formation of bubbles or, in the case of weight measurement, the inertia of the receptacle and of the volume dispensed.\nAnother solution consists in determining the volume drawn up or dispensed by measuring the time-profile of the liquid flow rate in the sampling tool.\nMany methods have been developed for measuring the flow rate of liquids flowing in tubes and channels of various sizes. These methods are based on various physical principles: heat transfer, mechanical, optical or electrical methods and, more precisely, magnetohydrodynamic or electrokinetic methods.\nThe thermal, mechanical and optical methods have the advantage of being independent of the electrical conductivity of the liquids, which may be different from one liquid to another. They have, on the other hand, the drawback of being technically complex to implement.\nIn addition, such systems are difficult to miniaturize. In systems for drawing up and dispensing small volumes, the liquid is generally drawn up from the bottom of a thin tube or a narrow-necked flask, and, once drawn up, is in the flared end of a cone or needle.\nMechanical, thermal or optical systems are generally too bulky to be placed directly at this location and must therefore be remote, thereby requiring an indirect measurement, via a liquid or air piston, of the volumetric flow rate at the top of the device.\nElectrical methods are strongly dependent on the conductivity of the solutions, but they are the easiest to put in place and implement, especially in a miniaturized format, which makes it possible to implement them near the free end of the aspirating/dispensing device that forms the input to the sampling instrument.\nElectrokinetic"} +{"output_text": ". 
It would be further desirable to be able to automatically create test programs directed at narrowed targets in a manner that allows more compact archival of data, while maintaining compatibility with existing test programs. It would be further desirable to be able to automatically create test programs directed at narrowed targets in a manner that allows more compact archival of data, while maintaining compatibility with existing test programs, and while allowing the reuse of existing test programs. It would be further desirable to be able to automatically create test programs directed at", "input_text": "., it is proposed to employ a test generator that automatically produces test programs based on a finite state machine model of the software. Limiting the number of test programs is achieved by controlling loop execution, and by appropriately setting the coverage level for the model, known as \u201ctransition cover testing\u201d. This approach seeks to specify during the test program generation process that each transition within the finite state machine model be exercised once. The generator is capable of specifying different coverage levels for selected portions of the program under test, so that critical portions might be exhaustively tested, while other portions receive less comprehensive testing.\nThere are several reasons for focusing test program generation. Some may be unanticipated during the development and implementation of the software specification. For example, the testing process may uncover programming defects. Such discovery may create the need to generate still more tests that work around the newly discovered defect in order to test unaffected parts of the software. Once the defect has been corrected, even more tests may need to be generated in order for verification. In practice, a supposedly corrected defect may surface again following subsequent program modification, or changes in the conditions of usage. 
Thus, it is desirable to repeatedly verify that the defect has not recurred.\nThe task has fallen to software engineers to revise test programs to accommodate incremental changes in the software program. As there is a cost in the generation of test models, engineers archive and reuse the products of the test generation process. While the archival technique is generally practical, maintaining compatible archived test programs has itself been proven costly. Furthermore, ad hoc practices of cataloging, finding, and retrieving combinations of test generation parameters are impractical. Because of the lack of alternatives, test engineers often are compelled to resort to archiving entire test suites, which is relatively costly.\nIt would be desirable to be able to automatically create test programs directed at narrowed targets in a manner that allows more compact archival of data"} +{"output_text": " Kronenberg et al. discloses a method for producing thin amorphous metal ribbons by rapidly quenching a molten metal alloy at a rate of at least 10.sup.6.degree. C./second. The alloy is quenched in a water-cooled copper block. The alloy is rapidly cooled by a water-cooled copper block. The alloy is rapidly cooled by a water-cooled copper block. The alloy is rapidly cooled by a water-cooled copper block. The alloy is", "input_text": "). Soft magnetic properties were reported by adding copper and niobium to iron-silicon-boron alloys. Such material currently has the name FINEMET.RTM. and reportedly has an ultrafine grain structure composed of bcc Fe solid solution. Desirable properties of FINEMET.RTM. are attributed to the bcc solid phase which contains boron and silicon. The general starting ingredients for producing such material, for example, technical ferroboron, niobium or ferroniobium, zirconium and copper are refined or semi-refined products and are quite expensive. 
In some cases, copper and niobium are added to the starting melts prior to quenching to an amorphous state at levels of 0.2-4.0 atomic percent each. Copper and niobium will form a molecular cluster that aids in the nucleation and control of the size of ferrite iron crystals, however, these materials, especially niobium, are very expensive and are a major drawback to further commercialization of these boron-stabilized nanocrystalline materials.\nTypically, amorphous metal alloys are produced by the very rapid cooling of a liquid metal alloy at approximately 10.sup.6.degree.C./second. The rapid cooling rate is required for the maintenance of the non-crystalline structure of the liquid alloy when it solidifies. Numerous methods are known for achieving this rapid cooling. One such technique employs rapid cooling at a moving cooled surface, such as a wheel or belt to produce thin wire strands, ribbons or other thin shapes. The thin structure may be laminated or wound to form a magnetic core, for example.\nAllied Signal's METGLAS.RTM. amorphous metal alloy is an industry standard having a thickness of from 20-23 microns. U.S. Res. Pat. No. 
32,925 to"} +{"output_text": " opposite sides.\nThe invention-specific measures are particularly suitable for pumping solid state media, such as solid state media which are used as amplification media.\nThe invention-specific measures are particularly suitable for pumping solid state media, such as solid state media which are used as amplification media.\nThe invention-specific measures are particularly suitable for pumping solid state media, such as solid state media which are used as amplification media.\nThe invention-specific measures are particularly suitable for pumping solid state media, such", "input_text": " description when only partial pumping or excitation takes place.\nBecause of the small dimension in relation to the amount of the pumped volume, a small thermal lens effect is achieved by the specific measures according to the invention.\nAdditionally, only extremely low depolarization losses occur, since in this case there is a quasi-one-dimensional heat transfer. By means of the defined, rectangular volume excitation with pumping radiation, an effect on the beam quality can be attained. This is done by having the height of the pumped volume cross section designed so that it approaches the dimension of the ground mode (ground mode diameter), resulting in a higher attainable efficiency. These advantages are to be particularly cited in relation to solid bodies which are used as amplification media. Additionally, they are to be cited precisely when such solid state media are pumped with diode radiation, for it is precisely with solid state media that the invention-specific measures can implement such pumping geometries relatively simply and efficiently.\nWhat is preferred is an adjustment of the ratio of the maximum to the minimum cross sectional width of the amplification medium pumped volume, viewed as perpendicular to the optical axis of the amplification medium. This adjustment is done so that it amounts to less than 1:5. 
This means that the fluctuation width of the optically pumped zone in the amplification medium, by such means as by one or more constrictions, is kept within defined small limits.\nAdditionally, the relationship of width to height of the rectangular cross section of the pumped volume must be greater than 1.8, so that an elongated cross sectional volume is pumped in the amplification medium. In contrast to a rectangular-cross-section pumped volume, by this means an advantage is achieved in that a quasi-one-dimensional heat transfer is present. Connected with that is minimal depolarization loss.\nApproximation of the pumped volume to a rectangular cross section can be simplified by having the amplification medium pumped from two"} +{"output_text": " a single, large, high-speed, high-capacity disk drive. RAID technology is a method of storing data on multiple disk drives in a manner that allows the data to be reconstructed in the event of a disk drive failure. RAID technology is typically implemented in a file server by using a RAID controller to control the distribution of data across the disk drives.\nRAID technology is typically implemented in a file server by using a RAID controller to control the distribution of data across the disk", "input_text": " enhance the reliability and availability of client/file server communications and client/client file system communications. Yet other methods of the prior art utilize information redundancy to allow the recovery and reconstruction of transactions lost due to failures occurring during execution of the transactions. These methods include caching, transaction logging and mirroring wherein caching is the temporary storage of data in memory in the data flow path to and from the stable storage until the data transaction is committed to stable storage by transfer of the data into stable storage, that is, a disk drive, or read from stable storage and transferred to a recipient. 
Transaction logging, or journaling, temporarily stores information describing a data transaction, that is, the requested file server operation, until the data transaction is committed to stable storage, that is, completed in the file server, and allows lost data transactions to be re-constructed or re-executed from the stored information. Mirroring, in turn, is often used in conjunction with caching or transaction logging and is essentially the storing of a copy of the contents of a cache or transaction log in, for example, the memory or stable storage space of a separate processor as the cache or transaction log entries are generated in the file processor.\nThe use of multiple, duplicate parallel communications paths or multiple, duplicate parallel processing units, caching, transaction logging and mirroring, however, are often unsatisfactory because they are often costly in system resources and require complex administrative and synchronization operations and mechanisms to manage the caching, transaction logging and mirroring functions and subsequent transaction recovery operations, and significantly increase the file server latency, that is, the time required to complete a file transaction.\nOne of the most frequently used methods of the prior art for the preservation and recovery of data and file transactions is RAID technology, which is a family of industry standard methods for distributing redundant data and error correction information across a redundant array of disk drives that essentially operates as"} +{"output_text": ". The Internet has become a global network of interconnected networks, with commercial and non-commercial entities providing access to the Internet. The Internet has also become a popular vehicle for the distribution of commercial content, such as music, movies, and software.\nThe Internet has also become a popular vehicle for the distribution of non-commercial content, such as news, weather, and financial information. 
The Internet has also become a popular vehicle for the distribution of non-commercial content, such as news, weather,", "input_text": "Internet\u201d of worldwide interconnected networks. In the late 1980s and early 1990s, NSFNET usage grew dramatically, jumping from 85 million packets in January 1988 to 37 billion packets in September 1993. The capacity of the NSFNET backbone was upgraded to handle this additional demand, eventually reaching T3 (45 Mbps) speed.\nIn 1992, the NSF announced its intention to phase out federal support for the Internet backbone, and encouraged commercial entities to set up private backbones. Alternative backbones had already begun to develop because NSFNET's \u201cacceptable use\u201d policy, rooted in its academic and military background, ostensibly did not allow for the transport of commercial data. In the 1990s, the Internet has expanded decisively beyond universities and scientific sites to include businesses and individual users connecting through commercial ISPs and consumer online services.\nFederal support for the NSFNET backbone ended on Apr. 30, 1995. The NSF has, however, continued to provide funding to facilitate the transition of the Internet to a privately-operated network. The NSF supported the development of three priority Network Access Points (NAPs), in Northern California, Chicago, and New York, at which backbone providers could exchange traffic with each other, as well as a \u201crouting arbiter\u201d to facilitate traffic routing at these NAPs. The NSF funded the vBNS (Very High-Speed Backbone Network Service), a non-commercial research-oriented backbone operating at 155 megabits per second. The NSF provides transitional funding to the regional research and educational networks, as these networks are now required to pay commercial backbone providers rather than receiving free interconnection to NSFNET. 
Finally, the NSF also remains involved in certain Internet management functions, through activities such as its cooperative agreement with SAIC Network Solutions Inc. to manage aspects of Internet domain name registration.\nSince the termination of federal funding for the NSFNET backbone, the Internet has continued to evolve"} +{"output_text": " 1990, pp. 245\u2013255). See also Couch, T. L. (1980) \u201cMosquito Pathogenicity of Bacillus thuringiensis var. israelensis,\u201d Developments in Industrial Microbiology 22:61\u201376; Beegle, C. C., (1978) \u201cThe Lepidoptera Toxins: A Review,\u201d Ann. Rev. Microbiol. 32:649\u2013669.\nMore recently, new subspecies of B.t", "input_text": ", and recombinant DNA-based B.t. products have been produced and approved for use. In addition, with the use of genetic engineering techniques, new approaches for delivering these B.t. endotoxins to agricultural environments are under development, including the use of plants genetically engineered with endotoxin genes for insect resistance and the use of stabilized intact microbial cells as B.t. endotoxin delivery vehicles (Gaertner, F. H., L. Kim [1988] TIBTECH 6:S4\u2013S7). Thus, isolated B.t. endotoxin genes are becoming commercially valuable.\nCommercial use of B.t. pesticides was originally limited to a narrow range of lepidopteran (caterpillar) pests. Preparations of the spores and crystals of B. thuringiensis subsp. kurstaki have been used for many years as commercial insecticides for lepidopteran pests. For example, B. thuringiensis var. kurstaki HD-1 produces a crystalline \u03b4-endotoxin which is toxic to the larvae of a number of lepidopteran insects.\nIn recent years, however, investigators have discovered B.t. pesticides with specificities for a much broader range of pests. For example, other species of B.t., namely israelensis and tenebrionis (a.k.a. B.t. M-7, a.k.a. B.t. 
san diego), have been used commercially to control insects of the orders Diptera and Coleoptera, respectively (Gaertner, F. H. [1989] \u201cCellular Delivery Systems for Insecticidal Proteins: Living and Non-Living Microorganisms,\u201d in Controlled Delivery of Crop Protection Agents, R. M. Wilkins, ed., Taylor and Francis, New York and London,"} +{"output_text": " of PTH (1-34) in Rats,\u201d Journal of Controlled Release, Vol. 62, Issues 1-3, pp. 1-9 (January 2000), and \u201cThe Effect of Electroporation on Eontophoretic Delivery of PTH (1-34) in Rats,\u201d Journal of Controlled Release, Vol. 62, Issues 1-3, pp. 10-18 (January 2000).\nIn addition, various methods of delivering PTH-based agents via", "input_text": " fracture, or who have failed or are intolerant to previous osteoporosis therapy. In men with primary or hypogonadal osteoporosis, FORTEO\u2122 has similarly been found to increase BMD.\nIn addition to subcutaneous injection, other means of delivering PTH-based agents have also been investigated. For example, various pulmonary delivery (i.e., inhalation) methods are discussed in \u201cPulmonary Delivery of Drugs for Bone Disorders,\u201d Advanced Drug Delivery Reviews, Vol. 42, Issue 3, pp. 239-248 (Aug. 31, 2000), Patton, \u201cBioavailability of Pulmonary Delivered Peptides and Proteins: \u2014Interferon, Calcitonins and Parathyroid Hormones,\u201d Journal of Controlled Release, Vol. 28, Issues 1-3, pp. 79-85 (January 1994), Patton, et al., \u201cImpact of Formulation and Methods of Pulmonary Delivery on Absorption of Parathyroid Hormone (1-34) from Rat Lungs,\u201d Journal of Pharmaceutical Sciences, Vol. 93, Issue 5, pp. 1241-1252 (May 2004), Codrons, et al., \u201cSystemic Delivery of Parathyroid Hormone (1-34) Using Inhalation Dry Powders in Rats,\u201d Journal of Pharmaceutical Sciences, Vol. 92, Issue 5, pp. 
938-950 (May 2003) and Pf\u00fctzner, A, et al., \u201cPilot Study with Technosphere/PTH (1-34)\u2014A New Approach for Effective Pulmonary Delivery of Parathyroid Hormone (1-34)\u201d, Horm. Metab. Res., Vol. 35(5), pp. 319-23.\nVarious methods of active transdermal delivery of PTH-based agents are also discussed in \u201cThe Effect of Electroporation on Eontophoretic Erans"} +{"output_text": ", and also performs a logical channel multiplexing for mapping several logical channels to one transport channel. The MAC layer is connected to an upper Radio Link Control (RLC) layer through a logical channel, and the logical channel is roughly divided into a control channel for transmitting control plane information and a traffic channel for transmitting user plane information according to the type of transmitted information.\nThe Radio Link Control (RLC) layer of the second layer segments and/or concatenates data received from an upper layer to", "input_text": "S-GW is called an S1-U interface, and both connections may be commonly called an S1 interface.\nA radio interface protocol is defined in the Uu interface which is a radio section, wherein the radio interface protocol is horizontally comprised of a physical layer, a data link layer, a network layer, and vertically classified into a user plane (U-plane) for user data transmission and a control plane (C-plane) for signaling transfer. Such a radio interface protocol can be typically classified into L1 (first layer) including a PHY layer which is a physical layer, L2 (second layer) including MAC/RLC/PDCP layers, and L3 (third layer) including a RRC layer as illustrated in FIGS. 2 and 3. Those layers exist as a pair in the UE and E-UTRAN, thereby performing data transmission of the Uu interface.\nFIGS. 
2 and 3 are exemplary views illustrating the control plane and user plane architecture of a radio interface protocol between UE and E-UTRAN in an LTE system, which is a mobile communication system to which the related art and the present invention are applied.\nThe physical layer (PHY) which is a first layer provides information transfer services to the upper layers using a physical channel. The PHY layer is connected to the upper Medium Access Control (MAC) layer through a transport channel, and data between the MAC layer and the PHY layer is transferred through the transport channel. At this time, the transport channel is roughly divided into a dedicated transport channel and a common transport channel based on whether or not the channel is shared. Furthermore, data is transferred between different PHY layers, i.e., between PHY layers at the transmitter and receiver sides.\nVarious layers exist in the second layer. First, the Medium Access Control (MAC) layer serves to map various logical channels to various transport channels"} +{"output_text": " a single pass.\nThe arrangement described in EP 0190378A1 is not suitable for the alloying of small workpieces, because the energy input is not sufficient to alloy the workpiece.\nThe arrangement described in EP 0190378A1 is not suitable for the alloying of workpieces with a high melting point, because the energy input is not sufficient to alloy the workpiece.\nThe arrangement described in EP 0190378A1 is not suitable for the alloying of workpieces with", "input_text": "etzig, S. Nowotny: Induktiv unterst\u00fctztes Laserauftragschwei\u03b2en\u2014eine Hybridtechnologie \u00fcberwindet Anwendungsgrenzen, 6th European Conference on Laser Treatment of Materials, Stuttgart Sep. 16-18, 1996, conference materials p. 477-484). 
The inductive preheating has a particularly effective impact on the reduction of the temperature gradient and the reduction of the transient stresses thus possible, because the energy input occurs not only over the surface but at a depth that can be established by means of the induction frequency. Moreover, the specific energy provision costs for the coupled inductive energy are at least one order of magnitude lower than for laser energy.\nYoshiwara and Kawanami (Method for surface-alloying metal with a high-density energy beam and an alloy steel, EP 0190378A1) thus claim a process with which an inductor or an oxygen acetylene burner permanently connected to a laser beam focusing unit in feed direction before the laser-irradiated area act on the workpiece. The area thus preheated is larger than that subsequently irradiated. Accordingly, an unsteady preheating temperature field results with a maximum that is shifted somewhat towards the laser beam and runs before the temperature field produced by the laser beam. In addition to this, the same arrangement can additionally be arranged after the laser beam point, thus realizing a postheating. The amount of energy that is supplied by the second energy source should be a substantial part of the necessary total process energy, but remain less than the amount of energy provided by the laser beam. The arrangement described in EP 0190378A1 can preferably be used for very large workpieces. With it, using an oxygen acetylene burner (data for an inductive preheating are not given), it was possible to alloy an alloy claimed in the same patent in"} +{"output_text": "lasses are also available with auxiliary lenses that are designed to enhance the user's vision, such as reading glasses, bifocals, and the like.\nThe auxiliary lenses may be attached to the primary eyeglasses via a variety of methods. The most common method is to use a magnet to attach the auxiliary lenses to the primary eyeglasses. 
The magnet is typically embedded within the primary eyeglass frame front. The magnet is typically a strong magnet, such as a neodymium magnet", "input_text": "That is, conventionally, the threshold value was normally set to a low level, giving priority to maintenance of service; however, in this case, there was a high probability of connecting to radio base stations where reception conditions were not really very good. 1. Field of the Invention\nThe present invention relates to attachable eyeglasses and, in particular, to a universal method of manufacturing attachable supplemental or auxiliary lenses having a magnet for removable attaching to primary eyeglasses via magnets embedded or even encapsulated within the primary eyeglass frame front.\n2. Background Information\nA lasting trend in fashion eyewear has been the use of \u201cclip-on\u201d sunglasses. Clip-on sunglasses typically consist of auxiliary lenses with clip-like appearances that fit about the frames of the primary eyeglasses for attaching thereto. The clip-ons may be tinted or otherwise treated. Traditionally some frame manufacturers have offered clip-ons as an extra accessory, but not all eyeglass frames have corresponding clip-ons. When available the clip-on could be specially ordered for the customer or could be purchased as a set with the frames. Alternatively, aftermarket clip-ons are available, including slip-ins, flip-ups, fit-overs, fit-behinds, and many variations.\nThe clip-on sunglasses represent one of the most common auxiliary lenses that are coupled to primary eyeglasses, but are not the only form of auxiliary lenses. The auxiliary lenses may be designed to assist the user with a select purpose, such as enhance nighttime driving, increased magnification for select project (needlework, jewelry working or cleaning, etc.), \u201c3D\u201d lenses, computer glasses (blocking certain wavelengths to reduce eyestrain through prolonged computer usage), and the like. 
A user's primary glasses may be supplemented via auxiliary lenses for almost any purpose that lenses have been designed. Sung"} +{"output_text": ".\nThe inference problem is particularly acute in the context of databases, where the answers to queries are typically stored in a database table. The problem is also acute in the context of social networks, where the answers to queries are typically stored in a database table, and where the answers are typically stored in a database table in a format that is not readily accessible to the user.\nThe inference problem is also acute in the context of databases, where the answers to queries are typically stored in a database table", "input_text": " very difficult to form an isolating region having as large a width as 200.mu.m (such as the one under an aluminum bonding pad). If an isolating region having a large width is necessary, therefore, it is formed by a method as shown in FIG. 5. Here, after narrow isolating layers 107a, 107b and 107c have been formed in respective grooves, an insulating film (for instance a SiO.sub.2 film) is deposited and selectively photoetched to form an isolating region 107' having a large width.\nAlthough an isolating region having a large width can be formed by this method, the isolating region thus obtained is not flush with the isolated element region, that is, a difference in level is produced between the isolating region and the isolated element region. When using the selective oxidation process, one half of the isolating layer (field oxide layer) is buried in the semiconductor layer, but according to the method of FIG. 5 the entirety of the insulating film 107' constitutes the difference in level. In other words, FIG. 5, the difference in level is more than when using the selective oxidation process. This is a serious drawback when microlithography is required in the neighborhood of the wide isolating layer. 
The inference problem in databases (and in social networks too, in a slightly different guise) occurs when sensitive information is disclosed indirectly, via a series of ostensibly secure answers to queries. Even though each individual query answer may be properly authorized for disclosure (i.e., the user's clearance level may permit her to receive the answer), the answers may nevertheless collectively compromise sensitive information, in that the user may be able to infer from these answers information that she is not authorized to have, particularly when she combines the answers with some additional knowledge, e.g., metadata such as integrity constraints or functional dependencies, or domain-specific knowledge"} +{"output_text": " USA (May 1, 1993) 90:6149-6153), treating acute and chronic inflammation (U.S. Pat. No. 5,187,157), treating acute and chronic pain (U.S. Pat. No. 5,284,840), treating acute and chronic inflammation (U.S. Pat. No. 5,284,840), treating acute and chronic pain (U.S. Pat. No. 5,284,840), treating acute and chronic inflammation", "input_text": " Additionally, a number of studies have shown that translocation of FGF-2 or FGF-1 to the nucleus either in the absence or presence of their cognate receptors is involved in DNA synthesis, but specific FGF targets have not been identified (Hawker, et al., (1994) Am. J. Phys., 266:H107-20; Hawker, et al., (1994) In Vitro Cellular And Developmental Biology. Animal30A:653-63; Wiedlocha, et al. (1996) Mol. Cell. Biol, 16:270-280; Wiedlocha, et al., (1994) Cell, 76:1039-1051). FGF-1 and FGF-2 ligands have been detected in intracellular compartments. Both ligands have been proposed to have specific intracellular sites of action that include stimulation of DNA synthesis for FGF-1 and stimulation of ribosomal gene transcription for FGF-2. 
A receptor-independent role for FGF-1 has been proposed using an FGF-1-Diphtheria toxin conjugate, which allowed receptor-independent, cytoplasmic entry of FGF-1.\nThe evidence for the activity of FGF proteins in a variety of beneficial biological processes, combined with the evidence indicating an intracellular site of action and a potential direct role for FGF proteins in signal transduction affecting cell proliferation and differentiation, make FGF proteins a desirable candidate molecule for the development of modified proteins as regulators of cell growth and differentiation, for the use in applications such as promoting wound healing, treating myocardial infarction (Svet-Moldavsky, G. J., et al, Lancet (Apr. 23, 1977) 913; U.S. Pat. Nos. 4,296,100 and 4,378,347), treating degenerative neurological disorders, such as Alzheimer's disease and Parkinson's disease (Walicke, P., et al, Proc Natl Acad Sci"} +{"output_text": " of copper is deposited using a precursor that is inexpensive and easy to manufacture.\nIt would also be advantageous to provide a method of depositing copper on diffusion barrier material using CVD to improve conformality, while also improving the adhesion between the copper and the barrier material.\nIt would also be advantageous to provide a method of depositing copper on diffusion barrier material using CVD to improve conformality, while also improving the adhesion between the copper and the barrier material.\nIt would also be advantageous to provide a", "input_text": " that has been expended on CVD, two major obstacles remain before a CVD copper process can be adopted in manufacturing. These two critical hurdles are (i) high cost of ownership (COO) for the CVD process and (ii) reliable adhesion to barriers. The presently available MOCVD processes and precursors do not satisfactorily fulfill both these criteria simultaneously without compromising film and process attributes. 
Since the precursor cost is a major contributor (greater than 65%) to the overall COO of the CVD process, precursors that can be inexpensively manufactured are preferred. However, precursor costs have to be lowered without compromising film properties. For instance, reliable and repeatable adhesion has to be achieved while simultaneously maintaining low via and contact resistance, high deposition rate, high conformality, and low cost of the precursor. Many IC manufacturers have employed a PVD Cu seed layer followed by a CVD Cu fill in order to achieve adequate film properties. The use of a PVD Cu seed layer underscores the difficulty in achieving low contact resistance and reliable adhesion on barriers (TiN or TaN) by a CVD process alone.\nAs the size of features on ICs continues to shrink, it is desirable to continue developing improvements in the adhesion of CVD copper to barrier materials as a replacement for PVD, which is unsuitable for metallizing the smallest features.\nIt would be advantageous to provide a method of improving the adherence of copper metallization to diffusion barrier material without the sacrifice in conformality associated with PVD copper.\nIt would also be advantageous to provide a method of depositing copper on diffusion barrier material using chemical vapor deposition (CVD) to improve conformality, while also improving the adhesion between the copper and the barrier material.\nIn addition, it would be advantageous to discover a method of adhering a thin seed layer of copper to surfaces of diffusion barrier material using high conformality CVD, wherein the thin seed layer"} +{"output_text": " purposes. 
For example, amylase enzymes have been used in the starch processing industry to hydrolyze starch to dextrins and sugars, in the textile industry to hydrolyze starches, in the brewing industry to hydrolyze malt, in the detergent industry to hydrolyze proteins, in the oil industry to hydrolyze starches, in the pharmaceutical industry to hydrolyze proteins, in the food processing industry to hydrolyze starches, in the leather industry to hydrolyze starches, in the paper industry to", "input_text": " to update at least one of a group of queues selected from the local request queue, the local reply queue, the remote request queue and the remote reply queue.\nLike the Local Queue Manager, the Remote Queue Manager manages a set of queues, wherein the set of queues comprises the remote request queue, the remote reply queue, a remote dead-letter queue, and a remote transmission queue and wherein the Remote Queue Manager uses the remote transmission queue to provide a remote address for communicating with the local request queue and the local reply queue.\nThe Remote Queue Manager communicates information regarding the updated queue to the remote destination application on the remote system in response to at least one command originating from the remote destination application.\nAlternatively, the present invention provides a CORBA-based Messaging API framework for communicating between a CORBA-based application and an MQ Series Messaging System. The framework includes a set of application programming interfaces for interfacing with an MQ Series Messaging system encapsulated by the Messaging API framework, wherein the set of application programming interfaces includes a poll request API, a receive reply API, a send reply API, a send request API, and a send request and block for reply API. 
The Messaging API framework receives commands communicated by the CORBA-based application, using C++ code selected from the group of commands which comprises poll request, receive reply, send reply, send request, and send request and block for reply. The Messaging API framework uses MQ Series compliant language to communicate such commands to the MQ Series Messaging System. The Messaging API framework uses MQ Series compliant language to receive responses to such commands from the MQ Series Messaging System. The Messaging API framework communicates the responses to the CORBA-based application..alpha.-Amylase enzymes have been used industrially for a number of years and for a variety of"} +{"output_text": " The 7th International Conference on Cardiac Pacing, Mar. 11-15, 1996, and in U.S. Pat. Nos. 5,131,388; 5,131,390; 5,131,391; 5,131,392; 5,131,393; 5,131,394; 5,131,395; 5,131,396; 5,131,397; 5,131,398; 5,131,399; 5,133,499", "input_text": " the prior art for general applications as well as for use in ICDs. More recently developed ICD IPGs employ one or more flat high voltage capacitors to overcome some of the packaging and volume disadvantages associated with cylindrical photoflash capacitors. For example, U.S. Pat. No. 5,131,388 discloses a flat capacitor having a plurality of stacked capacitor layers. Each capacitor layer contains one or more anode foil sheets forming an anode layer having an anode tab, a cathode sheet or layer having a cathode tab and a separator for separating the anode layer from the cathode layer. In the '388 patent, the electrode stack assembly of stacked capacitor layers is encased within a non-conductive, polymer envelope that is sealed at its seams and fitted into a chamber of a conductive metal, capacitor case or into a compartment of the ICD IPG housing, and electrical connections with the capacitor anode(s) and cathode(s) are made through feedthroughs extending through the case or compartment wall. 
The tabs of the anode layers and the cathode layers of all of the capacitor layers of the stack are electrically connected in parallel to form a single capacitor or grouped to form a plurality of capacitors. The aluminum anode layer tabs are gathered together and electrically connected to a feedthrough pin of an anode feedthrough extending through the case or compartment wall. The aluminum cathode layer tabs are gathered together and electrically connected to a feedthrough pin of a cathode feedthrough extending through the case or compartment wall or connected to the electrically conductive capacitor case wall.\nMany improvements in the design of flat aluminum electrolytic capacitors for use in ICD IPGs have been disclosed, e.g., those improvements described in \u201cHigh Energy Density Capacitors for Implantable Defibrillators\u201d presented by P. Lunsmann and D. MacFarlane at CARTS 96:"} +{"output_text": " cooling apparatus further includes a heat exchanger for cooling the brine flowing through the evaporator, a refrigerant evaporated by the heat exchanger, and a pipe for connecting the heat exchanger and the screw compressor.\nIn accordance with the present invention, there is provided a brine cooling apparatus including a screw compressor, a condenser, a main expansion valve, an evaporator, a pipe for connecting the screw compressor, the condenser, the main expansion valve and the evaporator, a refrigerant evaporated by", "input_text": " in an evaporator.\nSince a large amount of refrigerant is required in the flooded type and liquid circulating type cooling apparatuses in accordance with the prior art, they do not address the problems of the ozone layer breakage and global warming, and it is necessary to sufficiently consider an efficiency, a risk and the like in the case of employing ammonia.\nFurther, in the case of using the plate type heat exchanger, it is necessary to consider a risk that an internal freezing is generated when a flow rate of the 
brine is reduced and a heat transmitting pipe forming the heat exchanger is clogged so as to be deformed or broken.\nAn object of the present invention is to provide a brine cooling apparatus which can solve the problems mentioned above, prevent a brine from freezing within a heat exchanger, improve reliability and secure a stable operation.\nFurther, another object of the present invention is to provide a brine cooling apparatus which addresses an environmental problem by reducing an amount of used refrigerant, reducing a fear of breaking the ozone layer and preventing global warming.\nStill further, another object of the present invention is to provide a brine cooling apparatus which can secure an improvement in performance with a reduced amount of a refrigerant, provide an improved efficiency even when employing a natural type refrigerant, and increase safety with respect to combustibility and a poison of the natural type refrigerant.\nHere, the present invention is constituted such as to solve at least one of the problems mentioned above.\nIn order to achieve the objects mentioned above, in accordance with the present invention, there is provided a brine cooling apparatus including a screw compressor, a condenser, a main expansion valve, an evaporator, a pipe for connecting the screw compressor, the condenser, the main expansion valve and the evaporator, a refrigerant evaporated by the evaporator, and brine flowing through the evaporator. 
The brine"} +{"output_text": " for preionizing the working gas, the invention is characterized in that the high-voltage module is provided with a high-voltage pulse generator which is connected to the first electrode housing and the second electrode housing and which is provided with a high-voltage pulse generator which is connected to the first electrode housing and the second electrode housing and which is provided with a high-voltage pulse generator which is connected to the first electrode housing and the second electrode housing and which is provided with a high-voltage pulse generator", "input_text": " such as are required in chip fabrication for exposure machines with output powers of several hundred watts. With the greatest possible conversion efficiency that can be achieved for a plasma generated by gas discharge estimated at about 1%/2\u03c0\u00b7sr, an input power of 20 kW would be required to collect 100-watt EUV radiation in a solid angle of \u03c0sr. Further, it must be kept in mind that the majority of this enormous power for converting into plasma must be transmitted over discharge surfaces of a few square centimeters. 
It can easily be imagined that these small surfaces will not be stable over a long duration, so that radiation sources based on a gas discharge appear unsuitable for stable long-term use due to the fact that they must work in continuous operation for upwards of at least twenty hours and more at repetition frequencies of between 2 and 10 kHz for commercial use in chip lithography.\nTherefore, it is the primary object of the invention to find a novel possibility for the realization of an EUV radiation source which achieves a high average radiation output in the EUV region and remains stable for a sufficiently long period of time.\nAccording to the invention, in a radiation source for the generation of extreme ultraviolet (EUV) radiation based on a dense, hot plasma generated by gas discharge containing two electrodes which are electrically separated from one another by insulators which are resistant to breakdown and at the same time form rotationally symmetric electrode housings for parts of a vacuum chamber, wherein a gas discharge for plasma generation is provided between a first electrode housing and a second electrode housing within the vacuum chamber and an exit or outlet opening for the radiation emitted by the plasma is provided in the first electrode housing, further containing a gas supply unit for generating a flow of working gas through the vacuum chamber, a high-voltage module for providing high-voltage pulses at the electrodes and a preionization unit"} +{"output_text": "aded fuels was not a good predictor of performance.\nThe FAA researcher also reported that the unleaded blends tested were not as effective as the 100LL fuel in reducing the formation of nitrous oxides (NOx) in the exhaust of the engine. The researcher reported that the unleaded blends tested were not as effective as the 100LL fuel in reducing the formation of nitrous oxides (NOx) in the exhaust of the engine. 
The researcher reported that the unleaded blends tested were not as effective", "input_text": " aviation unleaded fuel blends earlier tested, another matrix of 47 unleaded fuel blends was developed and detonation tested in a Lycoming IO-540-K aircraft engine at the FAA William J. Hughes Technical Center in Atlantic City, N.J. Components of such blends included varying ranges of \u201chigh octane components\u201d such as aviation alkylate, super alkylate, toluene, ethyl tertiary butyl ether (ETBE), meta-toluidine, tert-butylbenzene. The blends contained iso-pentane for volatility control. Comprehensive blend formulations, by both volume fractions and mass fractions of those fuel blends were reported in Tables 2, 3, 4, and 5 of that report. The blends with a target range of 97.6 to 106.3 MON were tested against a baseline leaded reference fuel that met all specifications of ASTM D910 for Grade 100LL fuel with minimum MON and minimum performance number (PN) per ASTM D-909. The blends were also tested against a 100LL aircraft fuel purchased at the local airport. Here, the FAA researcher reported that none of the unleaded blends of equivalent or lower MON performed as well as the Grade 100LL fuel in the detonation tests, particularly as seen when operated on full scale engines rather than the laboratory test engines used to establish the ASTM D-2700 MON and the ASTM D-909 rich rating performance number. It was also demonstrated that increased fuel flow of the unleaded blends was required above the fuel flow required for 100LL in order to achieve equivalent detonation performance. In short, the tested blends provided less detonation protection than leaded formulations of equivalent MON, and appeared to potentially be less efficient. Importantly, the researcher again reported that using only motor octane number (MON) based on ASTM D-2700 (for knock rating, lean mixture) to predict full scale engine performance of unle"} +{"output_text": ", or a logo. 
The metal layer is vapor deposited on the indicium to a thickness of about 0.1 to about 0.5 microns. The metal layer is then heated to a temperature of about 200 to about 400\u00b0 C. to vaporize the metal and form a vaporized metal layer. The vaporized metal layer is then cooled to a temperature of about 100 to about 200\u00b0 C. to form a metal layer having a surface that overlies the", "input_text": " For example, opaque coloring agents can be rendered transparent to reveal underlying indicia, or similar agents can change from one color to another to indicate a change.\nChemical transformations in irreversible displays are sometimes used for security purposes to provide evidence of tampering or counterfeiting. U.S. Pat. No. 4,488,646 to McCorkle hides a warning message behind a solvent-sensitive blush coating to provide evidence of solvent tampering with letters, tickets, and other information-bearing constructions. Upon exposure to a wide range of aromatic or aliphatic solvents, the blush coating is transformed into a transparent state revealing the message. U.S. Pat. No. 4,903,991 to Wright discloses a document security system in which a latent image is developed by rupturing photoactive microcapsules to verify authenticity.\nMechanical transformations are more often used for interactive game pieces. The most common are scratch-off games in which an opaque coating is removed by abrasion to reveal a hidden indicium. Chang et al. in U.S. Pat. No. 5,431,452 separately position a latent image and a removable image-developing device on different portions of a substrate. The image-developing device contains a chromogenic composition that converts the latent image into a visible image.\nOur irreversible displays exploit features of thin metal films, especially vapor deposited films, for such purposes as temporarily obscuring predetermined indicia from view and subsequently reacting with chemical clearing agents to reveal the predetermined indicia. 
The thin metal films can be cleared away to reveal underlying indicia, or the indicia can also be formed by clearing the films in predetermined patterns. The clearing process is visually engaging as a preferably lustrous metal progressively disappears.\nOne example of our irreversible display includes a metal layer having a surface that overlies an indicium, such as a contrasting color, a pattern"} +{"output_text": "B's), and configurable interconnects. The interconnects are programmable to selectively connect the CLB's to the IOB's. The interconnects can be programmed to form any of a number of different interconnect topologies in response to CLB to IOB mapping information. The interconnects can also be programmed to form a ring around the CLB's. The interconnects can also be programmed to form a mesh. The interconnects can also be programmed to form a torus.", "input_text": " holding permanent information about the database as a whole or about a particular data set. It also acts as a placeholder for information that can be derived from the database.\nOne of the most significant options in DASDL (Data And Structure Definition Language) is that it is possible to define the database as to whether the database is to be audited. The data management system supports both logging changes to a database (auditing the database) or not logging changes (maintaining an unaudited database). There are advantages in auditing a database since this assures the user that if a database failure occurs, there will be a record of database changes with which one can restore the database to a completely integral state and thus avoid loss of information and corruption of information. 1. Field of the Invention\nThis invention is related to integrated circuits formed on a semiconductor substrate. More particularly, this invention is related to integrated circuits having multiple selectable functions. 
These functions are selectable during operation by \u201csoftware\u201d programming.\n1. Description of the Related Art\nThe structures of a field programmable gate array (FPGA) and programmed logic devices (PLD) are well known in the art. An FPGA and PLD each have configurable logic blocks (CLB) that will perform a Boolean logic operation on a group of input signals to perform a single complex logical function. The configurable logic blocks are then interconnected to form even more complex logic structures. The interconnection between the configurable logic blocks may be created by physically destroying fuses to break undesired connections or by activating pass transistors between wiring segments routed on the semiconductor substrate.\nU.S. Pat. No. 5,740,069 (Agrawal et al.) describes a programmable integrated circuit that includes configurable logic blocks (CLB's), configurable input/output blocks (IO"} +{"output_text": " electronics chamber below the maximum operating temperature of the electronics.\nThe thermal insulator flasks have been placed in the electronics chamber to retard heat transfer from the bore hole into the downhole tool and into the electronics chamber. The thermal insulator flasks have been placed in the electronics chamber to retard heat transfer from the bore hole into the downhole tool and into the electronics chamber. The thermal insulator flasks have been placed in the electronics chamber to retard heat transfer from the bore hole into the downhole tool and", "input_text": " a need to reduce the temperature within the downhole tool in the region containing the electronics, to the within the safe operating level of the electronics. Various schemes have been attempted to resolve the temperature differential problem to keep the tool temperature below the maximum electronic operating temperature, but none of the known techniques have proven satisfactory.\nDownhole tools are exposed to tremendous thermal strain. 
The downhole tool housing is in direct thermal contact with the bore hole drilling fluids and conducts heat from the bore hole drilling fluid into the downhole tool housing. Conduction of heat into the tool housing raises the ambient temperature inside of the electronics chamber. Thus, the thermal load on a non-insulated downhole tool's electronic system is enormous and can lead to electronic failure. Electronic failure is time consuming and expensive. In the event of electronic failure, downhole operations must be interrupted while the downhole tool is removed from deployment and repaired. Thus, various methods have been employed in an attempt to reduce the thermal load on all the components, including the electronics and sensors inside of the downhole tool. To reduce the thermal load, downhole tool designers have tried surrounding electronics with thermal insulators or placed the electronics in a vacuum flask. Such attempts at thermal load reduction, while partially successful, have proven problematic in part because of heat conducted from outside the electronics chamber and into the electronics flask via the feed-through wires connected to the electronics. Moreover, heat generated by the electronics trapped inside of the flask also raises the ambient operating temperature.\nTypically, the electronic insulator flasks have utilized high thermal capacity materials to insulate the electronics to retard heat transfer from the bore hole into the downhole tool and into the electronics chamber. Designers place insulators adjacent to the electronics to retard the increase in temperature caused by heat entering the flask and heat generated within the flask by the electronics. The design goal is to keep the ambient temperature inside of the"} +{"output_text": " analytical methods are based on the use of a solid phase which is coated with a material which is capable of binding to the target. 
The target can be removed by washing the solid phase with a solvent which is not damaging to the solid phase.\nThe solid phase can be a porous material such as a membrane or a porous bead. The solid phase can also be a porous material coated with a material which is capable of binding to the target. The solid phase can be a porous material coated with a material", "input_text": " cell surface molecules for which antibodies can be obtained. This can also be combined with size and density measurements. Methods for affinity purification of cells include \"Fluorescence Activated Flow Cytometry\" which can combine size with the existence of one or more cell surface markers. The problems of cell purification are severe due to their fragility and if antibody selection has been used the selected cells have to be used with the label still attached.\nMany areas of science use natural affinities for binding. In genetics complementarity of nucleic acids is commonly utilised as the basis of a method of analysis. For example mRNA is isolated by virtue of the fact that it always has a tail of adenine nucleotides at the end which can be bound to a row of thymidine nucleotides.\nSpecific gene nucleotide sequences can be captured by the complementary nucleotide sequence. These hybrids can usually be removed very easily by reducing the ionic strength of the elutant allowing the natural charge-driven repulsion between DNA strands to take effect.\nAs mentioned earlier, sometimes elution conditions cannot be found to remove without damage. This can be turned into an advantage however if the capture is done in the situation where it can act as the linker immobilising the desired activity in place so that subsequent steps can be performed in situ. 
For example this could be an enzyme reaction in the new science of biotransformations which uses immobilised enzymes for chemical synthesis.\nIf it is essential to remove the target material then a final resort is to use a membrane material which is itself soluble in a solvent not damaging to the target. This case however does release the capturing material also.\nFrequently the coupling of the material to its solid phase would be by covalent linker to avoid any problems of the leaching out of the capture moiety leading to contamination of the process.\nIf removal of the target moiety is difficult or unnecessary, analytical work can be done in situ. Many"} +{"output_text": " apparatus. The buckle chute includes a channel having a first end and a second end. The first end is adapted to receive a sheet. The second end is adapted to receive a stop. The stop is movable between a first position and a second position. The first position is adapted to stop the sheet at a first distance from the second end. The second position is adapted to stop the sheet at a second distance from the second end. The second distance is greater than the first distance.\nThe", "input_text": ", U.S. Pat. No. 4,701,233 (Beck et al.) discloses a method of folding a sheet by bulging a portion of the sheet and then folding the bulged portion through a roller nip. U.S. Pat. No. 4,875,965 (Marzullo) discloses a folding apparatus wherein a buckle chute is used for stopping a sheet, causing the sheet to enter a roller nip for folding. U.S. Pat. No. 4,944,131 (Gough) also discloses a folding apparatus having a buckle chute. In general, the sheet is allowed to enter into a channel of the buckle chute until the leading edge of the sheet is stopped by a stop. The leading edge stays in contact with the stop while the bulged portion is moved toward the roller nip for making a folded edge. The distance between the folded edge and the leading edge is usually adjustable. 
In the past, the buckle chute must first be removed from the folding machine and then the stop must be manually adjusted so as to change the distance between the folded edge and the leading edge. A series of markings indicating the folding distance are provided on the buckle chute to aid the positioning of the stop. While the folding distance can be roughly estimated by the markings, a precise folding distance is difficult to achieve. To obtain the correct folding distance, one may have to take the buckle chute out of the folding machine a number of times to adjust the stop.\nIt is advantageous and desirable to provide a buckle chute wherein the folding distance can be precisely adjusted without the need of removing the buckle chute from the folding machine.\nThe present invention is concerned with a buckle chute having a front side, a back side, a left side and a right side for use in a sheet folding"} +{"output_text": " principal focal planes of the alignment marks and the surface of the wafer, so that the detection optical system cannot be disposed at a position where the detection optical system is not interfered with the alignment marks.\nIn the conventional alignment apparatus, therefore, the detection optical system is disposed at a position where the detection optical system is not interfered with the alignment marks.\nIn the conventional alignment apparatus, however, the detection optical system is disposed at a position where the detection optical system is interfered with the", "input_text": " slant. An objective lens is disposed at an angle with the surface of the LFZPs on the opposite side with respect to the normal of LFZP. The lens has such an axial chromatic aberration that focal planes of the lens to light rays of a plurality of wavelengths agree with principal focal planes of the LFZPs by Fresnel diffraction to the plural wavelength rays which planes are different in position with each of the light rays, respectively. 
The Fresnel diffraction images on the principal focal planes of the LFZPs which planes are different in position to each of the plural wavelength rays incident upon the LFZP, are respectively focused through the objective lens on the same image forming planes of the objective lens in superposing relationship into a straight line. The Fresnel diffraction image in a straight line is converted into an electric signal by a linear sensor disposed on the image forming plane by a linear scanning operation in the direction perpendicular to the longitudinal direction of the image. A cylindrical lens is disposed between the objective lens and the linear sensor such that the Fresnel diffraction image in a straight line is compressed in the longitudinal direction thereof and is formed on the linear sensor. Then, a signal thus obtained by the linear sensor is handled to detect a relative position between the alignment marks on the LFZPs.\nThe principal focal plane (62) of the LFZP, however, as shown in FIG. 1, is parallel to the surface of LFZP, so that the principal focal planes of the alignment marks are parallel to the surface of the mask and wafer.\nWith such arrangement, to make a positional alignment servo operation possible during X ray exposure, it is necessary to enlarge a detection angle of a detection optical system disposed downstream of the objective lens with respect to the normal of the wafer (mask) plane. A large detection angle, however, deviates largely from an angle of 90.degree. between the"} +{"output_text": " by a grooved die. The grooved die may be a grooved roller or a grooved mandrel. The grooved roller may be a grooved roller or a grooved mandrel. The grooved roller may be a grooved roller or a grooved mandrel. The grooved mandrel may be a grooved mandrel or a grooved roller. The grooved roller may be a grooved roller or a grooved mandrel. The grooved mandrel may be a", "input_text": " up the bundle of fibers, is one thousandth of an inch in diameter. 
Therefore, a bundle of two hundred fibers has a diameter of approximately 0.05 inch. The final post peg may therefore be also 0.05 inch in diameter, including approximately 200 fibers plus the saturation of the epoxy binder with an optional colorant/opaquer mixed into the epoxy resin to modify and change these properties.\nAs an alternative to adding an opaquer mix into the epoxy resin, one or more metal fibers or wires at or near the center of the fiber bundle can be used. This would have the added advantage of providing a ready means to remove the post (if this were necessary) by the following method. The single centrally located wire or fiber can be pulled out leaving a pilot hole for guidance of a reamer to facilitate removal.\nA preferred embodiment for an epoxy resin is MASTER BOND(copyright) Polymer System EP21LV of Master Bond, Inc. of Hackensack, N.J. MASTER BOND(copyright) is a two component, low viscosity epoxy resin in which the fibers are cast. The rigidity of MASTER BOND(copyright) can be adjusted by adjusting the mix ratio of the two components. Other useful resins include polyester resins or vinyl ester resins. Depending upon the adjustment of the epoxy resin, the number of fibers can vary.\nPreferably the bundle of fibers have a rounded end and may also have a tapered end with an optional continuous groove or facet of 50 to 100 micron depths to increase surface texturing. The standard length of the post is about \u215d inch and the standard diameter is about 0.04 inch to 0.05 inch, with an optional taper at the top with xe2x85x9 inch linearly. The texturing may be by a die drawn across linearly or axially of 50 to 100 micron depth or it may be"} +{"output_text": "ically located at the rear of a forklift truck to minimize the distance the forklift must travel to unload the hopper. 
This is particularly important when the hopper is located in a confined space such as a warehouse or a shipping container.\nThe self-dumping hopper is typically mounted on the forklift truck by a mounting bracket that is attached to the forklift truck frame. The mounting bracket is typically attached to the hopper by a plurality of bolts that extend", "input_text": " across the whole gratings. 1. Field of the Invention\nThis invention relates to a self dumping hopper that permits safe operation of the hopper by a single operator.\n2. Description of Related Art\nSelf dumping hoppers are typically used in industrial settings to hold and contain waste, finished materials, raw materials and/or other bulk materials or products that require loading and unloading through a dumping operation. As such, self dumping hoppers typically operate between a latched position, such as shown in FIG. 1, and a dumping position, such as shown in FIG. 2.\nThe self dumping hopper is typically supported on the tines of a forklift truck which enter the fork pockets in the base in the same fashion that a pallet is handled. When unlatched, gravity maintains the hopper on a gear-like track that causes it to both dump and move forward in synchronization. When the base and track are level, an empty or uniformly loaded hopper will tend to rotate forward which causes its center of gravity to move even further forward to accelerate the dumping action. The forward tilt capability of the forklift mast may be used to move the center of gravity of an empty or loaded hopper in a forward direction; conversely, rear tilt moves the center of gravity rearward which diminishes the tendency of the hopper to dump and usually urges it into a rearward rest position against its stops.\nWhen the hopper is in its latched position it may be used for loading, storing, or transporting lading. The latched position is maintained by a latching system that is typified in FIG. 3. 
The latch is comprised of a hopper mounted latch pin, a frame mounted hook which is spring loaded to engage the pin, and a handle integral with the hook for manual disengagement.\nNormally, self-dumping hoppers are strateg"} +{"output_text": ", the specificity of hybridization is limited by the degree of complementarity between the probe or primer and the target sequence. In addition, the presence of secondary structures in the target sequence can also limit the specificity of hybridization.\nThe present invention provides a method for the synthesis of oligonucleotides having a modified sugar moiety. The method comprises the steps of: (a) reacting a nucleoside having a 3\u2032-OH group with a nucleoside having a 2\u2032-O-protected hydroxyl group, wherein the 2\u2032-O", "input_text": " al. (1994) J. Am. Chem. Soc. 116:3143-3144) and peptide (Nielsen et al. (1994) Bioconjugate Chem. 5:3-7) or guanidine (Dempcy et al. (1995) Proc. Natl. Acad. Sci USA, 92:6097-6101) linkages, have been shown to enhance hybrid stability. However, such modified oligonucleotides are non-extendible, because they lack a 3\u2032-OH group, and are therefore unable to serve as primers. Other hybrid-stabilizing modifications that have not been investigated with respect to their ability to support primer extension are 2\u2032-modified sugars (Monia et al. (1993) J. Biol. Chem. 268:14514-14522; Sproat et al. (1993) In Crooke, S. T. and Lebleu, B. (eds), Antisense Research and Applications. CRC Press, Boca Raton, Fla., pp. 352-362), conjugated intercalating agents (Asseline et al. (1984) Proc. Natl. Acad. Sci. USA 81:3297-3301) and substituted bases such as 2-aminoadenine (Lanun et al. (1991) Nucleic Acids Res. 19:3193-3198) or C5 propynyl pyrimidines (Wagner et al. (1993) Science 260:1510-1513). 
Thus, the need remains for a method of modifying short oligonucleotides so that they form more stable hybrids, such that the modification will not interfere with the ability of the oligonucleotides to serve as primers.\nA further shortcoming in the use of oligonucleotides as probes and primers is the difficulty of obtaining specificity such as single nucleotide mismatch discrimination using oligonucleotide probes and/or primers. In many cases"} +{"output_text": " halogen, and Rf is H, alkyl, aryl, arylalkyl or alkoxy.\nThe term \u201cprodrug esters\u201d as employed herein also includes prodrug esters which are known in the art for carboxylic acid esters such as methyl, ethyl, benzyl and the like. Other prodrug ester examples of R4 include the following groups:\n(1-alkanoyloxy)alkyl such as, \nwherein Ra", "input_text": "The term \u201cpolyhaloalkyl\u201d as used herein refers to an \u201calkyl\u201d group as defined above which includes from 2 to 9, preferably from 2 to 5, halo substituents, such as F or Cl, preferably F, such as CF3CH2, CF3 or CF3CF2CH2.\nThe term \u201cpolyhaloalkyloxy\u201d as used herein refers to an \u201calkoxy\u201d or \u201calkyloxy\u201d group as defined above which includes from 2 to 9, preferably from 2 to 5, halo substituents, such as F or Cl, preferably F, such as CF3CH2O, CF3O or CF3CF2CH2O.\nThe term \u201cprodrug esters\u201d as employed herein includes prodrug esters which are known in the art for carboxylic and phosphorus acid esters such as methyl, ethyl, benzyl and the like. 
Other prodrug ester examples of R4 include the following groups:\n(1-alkanoyloxy)alkyl such as, \nwherein Ra, Rb and Rc are H, alkyl, aryl or arylalkyl; however, RaO cannot be HO.\nExamples of such prodrug esters R4 include \nOther examples of suitable prodrug esters R4 include \nwherein Ra can be H, alkyl (such as methyl or t-butyl), arylalkyl (such as benzyl) or aryl (such as phenyl); Rd is H, alkyl, halogen or alkoxy, Re is alkyl, aryl, arylalkyl or"} +{"output_text": ". Exemplary cellular telephone systems configured substantially in accordance with the use of the IS-95 standard are described in U.S. Pat. Nos. 5,103,459 and 4,901,307, which are assigned to the assignee of the present invention and fully incorporated herein by reference. An exemplary system utilizing CDMA techniques is the cdma2000 ITU-R Radio Transmission Technology (RTT) Candidate Submission (referred to herein as cdma2000), issued by", "input_text": "; rotating the rotor; directing light to the rotating rotor; receiving light reflected from an optically reflective surface of the rotor with a photodiode; converting the received light to electricity; transforming the electricity to a higher voltage; rectifying the higher voltage electricity; and storing the higher voltage electricity in a capacitor.\nFurther objects, features and advantages of the invention will become apparent from the following detailed description taken in conjunction with the following drawings. 1. Field\nThe present invention relates to communication systems, and more particularly, to the transmission of wideband signals in communication systems.\n2. Background\nThe field of wireless communications has many applications including, e.g., cordless telephones, paging, wireless local loops, personal digital assistants (PDAs), Internet telephony, and satellite communication systems. A particularly important application is cellular telephone systems for remote subscribers. 
As used herein, the term \u201ccellular\u201d system encompasses systems using either cellular or personal communications services (PCS) frequencies. Various over-the-air interfaces have been developed for such cellular telephone systems including, e.g., frequency division multiple access (FDMA), time division multiple access (TDMA), and code division multiple access (CDMA). In connection therewith, various domestic and international standards have been established including, e.g., Advanced Mobile Phone Service (AMPS), Global System for Mobile (GSM), and Interim Standard 95 (IS-95). IS-95 and its derivatives, IS-95A, IS-95B, ANSI J-STD-008 (often referred to collectively herein as IS-95), and proposed high-data-rate systems are promulgated by the Telecommunication Industry Association (TIA) and other well known standards bodies.\nCellular telephone systems configured in accordance with the use of the IS-95 standard employ CDMA signal processing techniques to provide highly efficient and robust cellular telephone service"} +{"output_text": " supported, such that the rack is movable between a position where the rack is placed on the ground plane and a position where the rack is retracted from the ground plane.\nAccording to this preferred embodiment, the auxiliary wheel is arranged in the vicinity of the folding mechanism such that the auxiliary wheel can be placed on the ground together with the front wheel and the rear wheel in the folded state of the folding bicycle, and the rack is arranged above the rear wheel, so that the auxiliary wheel and the rack", "input_text": " region by the axle shift mechanism, thereby releasing engagement between the rear wheel-side sprocket and the sprocket chain, so that it is possible to prevent the rear wheel-side sprocket from being rotated with rotation of the rear wheel, thereby allowing the folding bicycle to be moved by rotation of the wheels.\nPreferably, the folding bicycle includes a front wheel-securing mechanism that 
rotatably secures the front wheel in a folded state of the folding bicycle, such that lines of intersection where a plane of rotation of the front wheel defined as an imaginary geometrical plane which contains a diameter of the front wheel and to which a rotational axis of the front wheel is perpendicular, and a plane of rotation of the rear wheel defined as an imaginary geometrical plane which contains a diameter of the rear wheel and to which a rotational axis of the rear wheel is perpendicular intersect with a ground plane defined as an imaginary geometrical plane corresponding to an arbitrary surface on which the front wheel and the rear wheel are supported are substantially parallel to each other.\nAccording to this preferred embodiment, the front wheel-securing mechanism rotatably secures the front wheel in the folded state of the bicycle, such that lines of intersection where planes of rotation of the respective front and rear wheels intersect with the ground plane are substantially parallel to each other, so that when the folding bicycle in the folded state is moved by rotation of the wheels, the front and rear wheels rotate smoothly, thereby making it easy to move the folding bicycle.\nPreferably, the folding bicycle includes an auxiliary wheel arranged in the vicinity of the folding mechanism such that the auxiliary wheel can be placed on a ground together with the front wheel and the rear wheel in the folded state of the folding bicycle, a rack arranged above the rear wheel, and a rack retainer mechanism that holds the rack generally horizontal with respect to a ground plane on which the front and rear wheels are"} +{"output_text": "or 4-vinyl pyrrole, the prior art teaches that the polymer is conductive when doped with FeCl.sub.3, but the conductivity is not as high as that of PP.\nThe prior art also teaches that the conductivity of PP is increased by doping with FeCl.sub.3, but the conductivity is not as high as that of the polymer produced by 
Naarman.\nThe prior art also teaches that the conductivity of PP is increased by doping with FeCl.sub.3", "input_text": "orklund, R. B. and Lundstroem, I., Journal of Electronic Materials, Vol 13, No. 1, 1984.\nAs also stated in Bjorklund et al, they were aware that anhydrous FeCl.sub.3 used as a dopant with poly-p-phenylene exists as an FeCl.sub.4 (2.sup.-) complex in the polymer matrix, thus imparting conductivity to the polymer. Other polymers, for example polyacetylene impregnated with FeCl.sub.3 or other oxidants such as SbCl.sub.5, and, neutral polypyrrole which is exposed to FeCl.sub.3 vapor or an anhydrous solution of the electrolyte, is also made conductive. But impregnating a preformed polymer with FeCl.sub.3 to make it conductive does not suggest that one may use anhydrous FeCl.sub.3 as an initiator to form the polymer from the pyrrole monomer, or that the FeCl.sub.3 would generate a charged species in the polymer formed. As is well-known, poly-p-phenylene cannot be formed by initiation with FeCl.sub.3 (see \"Reaction of Ferric Chloride with Benzene\", by P. Kovacic and C. Wu, J. Polym. Sci. Vol XLVII pg 45-54 at pg 45, first sentence of \"Results\", 1960), and the polymer is not conductive unless post-treated with FeCl.sub.3.\nThe insulating character of PP produced by Naarman is attributable to the combination of AlCl.sub.3 and Cu.sup.+2 Cl.sub.2 as the initiator, further possibly to the low molar ratio of the initiator to the pyrrole in the reaction mixture.\nWith respect to polymers of 3- and/"} +{"output_text": " direction to allow for easy insertion into the ground. The lawn edging strip is made of a material that is resistant to rust and is preferably made of galvanized steel. The lawn edging strip is also preferably made of a material that is resistant to chipping and flaking. The lawn edging strip is preferably made of a material that is resistant to chipping and flaking. 
The lawn edging strip is preferably made of a material that is resistant to chipping and flaking. The", "input_text": "duous one. Furthermore, achieving a straight line across the top of the edging as well as a constant vertical alignment of the edging while back filling the trench is cumbersome and requires precise trenching and backfilling.\nThe second group of edging consists generally of edging that can be driven directly into the ground without trenching. Metal edging is an example of such edging. However, there are drawbacks with metal edging. It is often painted and will suffer chips and flakes over time, and it can also rust. This negatively affects the aesthetics of the edging. Also, the edges of the metal edging are typically narrow enough to pose a significant safety hazard. Children playing in the yard or people performing yard work around the edging run the risk of stepping or stumbling against the edging and being cut by the metal edges. Further, installation of metal edging can be cumbersome in that separate stakes are required that are positioned along the edging and hammered into the ground to force the edging into the ground. These separate stakes are an additional safety hazard because the top of the stake may rise above the top of the edging and pose an additional risk of snagging or cutting a person.\nOne conventional lawn edging device that overcomes some of the drawbacks of the above general types of edging is disclosed in U.S. Pat. No. 5,456,045, entitled \u201cLawn Edging Strip.\u201d This reference shows a lawn edging strip that can be inserted into the ground without the need for trenching or stakes while at the same time remaining rust free, of uniform color and safe. The lawn edging strip is structured such that it can be inserted into the ground in proper vertical alignment. It is rigid enough to withstand hammering of into the ground yet is flexible enough in a horizontal"} +{"output_text": "ters. 
Endriz uses a lens array to focus the light from a laser diode array onto a solid-state gain medium.\nThe laser optical systems of the prior art have several disadvantages. First, the lens arrays are relatively large and heavy. Second, the lens arrays are relatively expensive. Third, the lens arrays are relatively difficult to align. Fourth, the lens arrays are relatively difficult to manufacture. Fifth, the lens arrays are relatively difficult to maintain. Sixth, the lens arrays are relatively difficult to", "input_text": " less than the spacings of the light beams at the laser array. A 125-fold reduction in the light spot spacings from 250.mu.m at the laser array emitting surface to just 2.mu.m at the medium is typical. Even overlapped or concentric light beams are suggested as a possibility. Due to the point source nature of the light beams emitted from the described array, those beams typically have a large divergence angle. In order to enable a large portion of each of the light beams to be collected by the objective lens so that light spots of significant power are imaged onto the photosensitive surface, the optical system further includes an array of lenses, one for each laser source in the laser array, located in the optical path between the laser array and objective lens. Each lens in the lens array reduces the divergence angle of the light beam received from its associated laser source so that substantially all of the light collected by each lens will enter the objective lens without changing the apparent beam spacing of the laser array seen by the objective lens. Lenses with cylindrical symmetry may be used to shape the individual laser beams and to compensate for laser astigmatism.\nU.S. Pat. Nos. 4,972,427 to Streifer et al., 5,081,637 and 5,185,758 to Fan et al., and 5,168,401 to Endriz disclose other laser optical system of the prior art which position lens arrays in front of corresponding arrays of laser diodes. 
In the patents to Fan et al., the collimated light emerging from the lens arrays are focused by a lens in order to converge the beams so that they overlap in a solid-state gain medium for optically pumping that medium. Streifer et al. place a lens array within an external Talbot cavity. A separate cylindrical collection lens is use to reduce the divergence in the transverse direction perpendicular to the plane of the laser diode emit"} +{"output_text": " is in the first position, to a second position in which the boom is below the cab when the cab is in the second position.\nAccording to another aspect of the present invention a boom truck is provided comprising the following components: A chassis having a front end and a rear end, first and second sides, a front axle mounting front wheels, and a rear axle mounting rear wheels. A cab having a top, and positioned on the chassis between the chassis front and rear ends. A boom. A", "input_text": " as when the cab is located in the central portion of the chassis.\nAccording to the present invention a boom truck is provided that has the advantages of both of the different prior art constructions described above. The boom truck according to the present invention, when in an operating position, has the boom above the cab with the cab centrally located for good operator visibility. However when it is desired to transport the truck, the cab is moved to one side of the chassis, and the boom is lowered to a position where at least a portion (or substantially all) of the boom is below the top of the cab, for ease of transport. The operation of the truck according to the invention, providing for effective movement of the cab between operating and ease of transport positions, is much simpler than in other known configurations in which the cab position is movable for various reasons, such as shown in U.S. Pat. Nos. 
3,963,132 and 4,630,700.\nAccording to one aspect of the present invention a boom truck is provided comprising the following components: A chassis having a front end and a rear end, first and second sides, a front axle mounting front wheels, and a rear axle mounting rear wheels. A cab having a top, and positioned on the chassis between the chassis front and rear ends. A boom. A support for the boom mounted between the cab and the chassis rear end. The boom pivotally mounted to the support for pivotal movement about a first, substantially horizontal, axis extending in an imaginary line intersecting the chassis sides. The cab movable with respect to the chassis from a first position positioned beneath the boom, to a second position to one side of the boom so that the boom may be positioned next to the cab with at least a portion of the boom below the top of the cab. And, the boom movable from a first position in which the boom is above the cab when the cab"} +{"output_text": " coatings.\nIn addition, the inventive coating compositions provide a significant improvement over the standard, commercial coatings in that they provide a significant reduction in the amount of silicone required to achieve the desired low permeability effect. As a result, the inventive coating compositions provide a significant reduction in the amount of silicone required to achieve the desired low permeability effect.\nFurthermore, the inventive coating compositions provide a significant reduction in the amount of silicone required to achieve the desired low permeability effect. As a result, the", "input_text": " invention as set forth in the claims.\nWith such an improvement in one-piece side curtain airbags (and inflatable fabrics), the possibility of high leakage at seams is substantially reduced. 
These airbags provide balanced weave constructions at and around attachment points between two layers of fabrics such that the ability of the yarns to become displaced upon inflation at high pressures is reduced as compared with the standard one-piece woven airbags. Unfortunately, such inventive one-piece woven bags are still problematic in that the weave intersections may be displaced upon high pressure inflation such that leakage will still most likely occur at too high a rate for proper functioning. As a result, there is still a need to coat such one-piece woven structures with materials which reduce and/or eliminate such an effect. However, such one-piece woven structures permit extremely low add-on amounts of elastomeric coatings for low permeability effects. In fact, these inventive airbags function extremely well with low add-on coatings below 1.5 and as low as about 0.8 ounces per square yard.\nFurthermore, although it is not preferred in this invention, it has been found that the inventive coating composition provides similar low permeability benefits to standard one-piece woven airbags, particularly with the inventive low add-on amounts of high tensile strength, high elongation, non-silicone coatings; however, the amount of coating required to permit high leak-down times is much higher than for the aforementioned Sollars, Jr. inventive one-piece woven structure. Thus, add-on amounts of as much as 1.5 and even up to about 2.2 ounces per square yard may be necessary to effectuate the proper low level of air permeability for these other one piece woven airbags. Even with such higher add-on coatings, the inventive coatings themselves clearly provide a marked improvement over the standard, commercial"} +{"output_text": ". H.; Gray, N. M.; Iyengar, S.; Jacobson, A. E. ; Rice, K. C. ; de Costa, B. R.: Evaluation of U50488H analogs for antiischemic activity in the gerbil. Brain Res. 1991, 546, 79-82; and Long, J. B.; Tortella, F. C.; Rice, K. C.; de Costa B. 
R.: Selective sigma ligands protect against dynorph", "input_text": " were identified to be potent and selective sigma ligands [B. R. de Costa et al, J. Med. Chem., 32(8), 1996-2002 (1989)]. Further structure activity studies with these compounds resulted in the identification of (+)- and (-)-cis-N-[3,4-dichlorophenylethyl]-N-methyl-2-(1-pyrrolidinyl)-cyclohexyl amines as extremely potent and selective ligands for the sigma receptor. These [Contreras, P. C.; Ragan, D. M.; Bremer, M. E.; Lanthorn, T. H.; Gray, N. M.; Iyengar, S.; Jacobson, A. E. ; Rice, K. C. ; de Costa, B. R.: Evaluation of U50488H analogs for antiischemic activity in the gerbil. Brain Res. 1991, 546, 79-82] and related (ethylenediamines) compounds [Long, J. B.; Tortella, F. C.; Rice, K. C.; de Costa B. R.: Selective sigma ligands protect against dynorphin A-induced spinal cord injury in rats. Soc. Neurosci. Abs., 16, 1122 (1990) abs 461.4] were found to be effective as-protective agents for the damaging effects of ischemia and stroke in two different models of ischemia. See, for example, Long, J. B.; Tortella, F. C.; Rice, K. C.; de Costa B. R.: Selective sigma ligands protect against dynorphin A-induced spinal cord injury in rats. Soc. Neurosci. Abs., 16, 1122 (1990) abs 461.4; Contreras, P. C.; Ragan, D. M.; Bremer, M. E.;\nLanthorn, T"} +{"output_text": ", the sensors must be curved as well. This is difficult to achieve, and the sensors are therefore expensive.\nThe object of the present invention is to provide a sensor that is easy to produce, cheap in production, and relatively small.\nThe object is achieved by a sensor comprising a sensor surface and a plurality of sensors arranged on the sensor surface, each sensor comprising a sensor element and a sensor surface, the sensor surface being adapted to be touched by a finger of a user, the sensor element", "input_text": ".S. Pat. No. 
4,429,413 and international patent application PCT/NO96/00082.\nFingerprint sensors may be exposed to long term use under varying and sometimes demanding conditions. The sensor therefore needs to have a robust surface and to be as insensitive to pollution in the fingerprint and on the sensor as possible. It must be capable of reading most fingerprints without being disturbed by latent prints from earlier use, and also be capable of imaging so-called \u201cdry fingers\u201d that represent a problem for optical sensors. In some cases, e.g. in credit cards or computer keyboards, it would also be advantageous if the sensor could be made compact.\nIn the view of costs there is also a demand for simplicity and minimizing of the number of parts.\nIt is an object of the present invention to provide a sensor being easy to produce, making them cheap in production, and also relatively small.\nIn addition to the solutions mentioned above the measuring of capacitance has been tried as a method to measure finger prints. Examples are shown in U.S. Pat. No. 4.353.056 and U.S. Pat. No. 5.325.442. While the ridges of the fingerprint touch the sensor surface the valleys have a small distance to the sensor surface, resulting in a difference in capacitance and/or conduction measured at the different sensors. Humidity may affect the measurements, but if it is evenly distributed throughout the fingerprint an analysis of the contrast between the measurements can provide a picture of it.\nAll the solutions mentioned above are based upon two-dimensional sensor arrays with dimensions comparable to the size of the fingerprint. These are expensive and difficult to produce, since they comprise a large number of sensors simultaneously measuring the surface.\nAlso, the known sensors disclose only flat sensor surfaces. 
As finger prints are curved"} +{"output_text": " of the present invention.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer interconnection structure.\nIn recent years, the integration density of semiconductor devices has been increased more and more, and the multilayer interconnection structure has been employed in order to increase the number of interconnections. In the multilayer interconnection structure, the interconnection layers are formed by alternately stacking a plurality of interlayer insulating", "input_text": " be construed as an admission that the reference is prior art to the present application. Photography of forming a direct positive image without employing a reversal step or using a negative film is well known.\nA conventionally known process of forming a positive image with a known direct positive silver halide photographic material may be essentially be classified into the following two types, with certain exceptions, considering the practical usefulness of the process.\nOne type produces a direct positive image, where a previously fogged silver halide emulsion is used and the fogged nuclei (latent image) in the exposed area is broken by solarization or the Herschel effect for development to thereby obtain the intended direct positive image.\nThe other type produces a direct positive image, where a non-fogged internal latent image-type silver halide emulsion is used and the emulsion is, after imagewise exposure, subjected to surface development during or after a fogging treatment to thereby obtain the intended direct positive image.\nThe above-mentioned internal latent image-type silver halide photographic emulsion means a silver halide photographic emulsion of a type such that the silver halide grains therein have light-sensitive nuclei essentially in the inside thereof and a latent image is formed essentially in the 
inside of the grains by exposure.\nThe method of the latter type generally has a higher sensitivity than that of the former type and is therefore suitable for uses which require a high sensitivity. The present invention relates to the latter type.\nVarious techniques are known in this technical field. For instance, U.S. Pat. Nos. 2,592,250, 2,466,957, 2,496,875, 2,588,982, 3,317,322, 3,761,266, 3,761,276 and 3,796,577 and British Patents 1,151,363, 1,150,553 and 1,011,062 illustrates essential techniques"} +{"output_text": "chain alkyl (meth)acrylate and a hydrophilic monomer is emulsified and dispersed in an aqueous emulsifier solution to form monomer liquid droplets 5 \u03bcm or less and the monomer liquid droplets is polymerized. Japanese Kokai Publication Hei-10-129921 (pp 2-3) discloses a water-based coating composition obtained in the following manner: a mixture of a long-chain alkyl (meth)acrylate and a hydrophilic monomer is emulsified and dispersed in an aqueous emulsifier", "input_text": " part of the common general knowledge as at the priority date of any of the claims.\nThroughout the description and claims of the specification the word \u201ccomprise\u201d and variations thereof, such as \u201ccomprising\u201d and \u201ccomprises\u201d, is not intended to exclude other additives, components, integers or steps. Since a resin composition dispersed in water may suppress the content of an organic solvent compared to that of a conventional resin dispersed in solvent, such a resin composition dispersed in water has been employed as an environment-friendly resin in a variety of uses such as water-based coating materials (coating compositions) for vehicles, plastic-molded products, domestic electric appliances, steel products, large scale constructions, aircrafts, building materials, construction materials, tiles, and craft products as well as adhesives, resist, printing ink. 
In a field of automotive coating compositions such as a clear top coating composition for finishing automobiles among such fields, the resin composition dispersed in water is required to give good appearance, sufficiently stable and excellent coating physical properties, and especially high water resistance and further along with increasing consciousness on environmental issues in recent years, in order to satisfy the requirement of low VOC (volatile organic compounds), a technique realizing sufficient reduction in the content of an organic solvent has been longed for.\nWith regard to a conventional resin dispersed in water, Japanese Kokai Publication Hei-6-192341 (pp 2-3) discloses a long-chain (meth)acrylate copolymer latex obtained in the following manner: a mixture of a long-chain alkyl (meth)acrylate and a hydrophilic monomer is emulsified and dispersed in an aqueous emulsifier solution to form monomer liquid droplets 5 \u03bcm or less and the monomer liquid droplets is polymerized. Japanese Kokai Publication 2000-313863 (pp 2-3) discloses an emulsion type adhesive composition obtained in the following manner: a monomer mixture containing a long-"} +{"output_text": " to be located in the same country. However, in many cases, the buyer and seller are located in different countries. For example, the buyer may be located in the United States, and the seller may be located in Japan. In this case, the buyer's bank must obtain payment from the seller's bank, and the seller's bank must obtain payment from the buyer's bank.\nIn the case of a buyer located in the United States and a seller located in Japan,", "input_text": ". The buyer's bank advises the seller's bank that an L/C has been opened in favor of the seller, and the seller's bank accepts the buyer's bank's guarantee to pay. The seller's bank advises the seller that an L/C has been opened in its favor, and the conditions which must be fulfilled for payment to occur. 
Usually, the seller's bank makes an irrevocable promise to pay the seller upon presentation of appropriate documents. The L/C document is considered an asset of the seller, and can be sold or assigned by the seller.\nDocumentation which the seller usually must present to obtain payment includes a bill of lading from its shipper, an invoice identifying the purchase, an appropriate insurance certificate, a certificate of inspection from an inspection firm confirming that the required goods are being shipped, export licenses and/or health inspection certificates, and certificates of origin used by customs personnel. After the correct documents are presented, the seller's bank pays the seller, then collects payment from the buyer's bank and delivers the presented documents to the buyer's bank. In turn, the buyer's bank obtains payment from the buyer.\nMeanwhile, the shipper, via a carrier, transports the goods to the buyer's location. The carrier requires presentation of the bill of lading, which was delivered to the seller, before transferring possession of the goods to the buyer.\nThe buyer obtains the bill of lading from its bank after payment, and then the buyer and its broker arrange for presentation of the bill of lading to the carrier and delivery of the goods to the buyer's location. Often, the carrier delivers the goods to the buyer's broker at the customs entry point of the buyer's country.\nDuring an international trade using a conventional letter of credit, the buyer and seller are assumed"} +{"output_text": " implemented by identifying multiple distinct allowed/valid programmed threshold voltage ranges separated by forbidden ranges. Each distinct threshold voltage range corresponds to a predetermined value for the set of data bits encoded in the memory device.\nTypically, the program voltage applied to the control gate during a program operation is applied as a series of pulses. 
In one embodiment, the magnitude of the pulses is increased with each successive pulse by a predetermined step size (e.g. 0.2 v, 0.3 v, 0.", "input_text": " devices, non-mobile computing devices and other devices. Electrical Erasable Programmable Read Only Memory (EEPROM) and flash memory are among the most popular non-volatile semiconductor memories.\nBoth EEPROM and flash memory utilize a floating gate that is positioned above and insulated from a channel region in a semiconductor substrate. The floating gate is positioned between the source and drain regions. A control gate is provided over and insulated from the floating gate. The threshold voltage of the transistor is controlled by the amount of charge that is retained on the floating gate. That is, the minimum amount of voltage that must be applied to the control gate before the transistor is turned on to permit conduction between its source and drain is controlled by the level of charge on the floating gate.\nWhen programming an EEPROM or flash memory device, such as a NAND flash memory device, typically a program voltage is applied to the control gate and the bit line is grounded. Electrons from the channel are injected into the floating gate. When electrons accumulate in the floating gate, the floating gate becomes negatively charged and the threshold voltage of the memory cell is raised so that the memory cell is in a programmed state. More information about programming can be found in U.S. Pat. No. 6,859,397, titled \u201cSource Side Self-Boosting Technique For Non-Volatile Memory,\u201d and U.S. Pat. No. 6,917,545, titled \u201cDetecting Over Programmed Memory,\u201d both of which are incorporated herein by reference in their entirety.\nSome EEPROM and flash memory devices have a floating gate that is used to store two ranges of charges and, therefore, the memory cell can be programmed/erased between two states (an erased state and a programmed state). 
Such a flash memory device is sometimes referred to as a binary flash memory device.\nA multi-state flash memory device is"} +{"output_text": "\nIn the conventional method of forming a bird's beak, the polysilicon is oxidized to form the bird's beak. The bird's beak is then implanted with a dopant to form the bird's beak. The bird's beak is then oxidized again to form a second bird's beak. The second bird's beak is then implanted with a dopant to form a second bird's beak. The second bird's beak is then oxidized again to form a", "input_text": " (17), or at the lower end of the floating gate near the drain impurity diffusion layer (16), or at both locations. In these cases, the conventional \"beak\" of the bird's beak is usually long and elongated, thus increasing the size of the cell and at the same time providing paths for current leakage and, therefore, low memory speed.\nThe formation of a conventional bird's beak in a polysilicon gate is better shown in FIGS. 1b and 1c. In FIG. 1b, layers of gate oxide (210), polysilicon (220) and nitride (230) are successively formed on substrate (200) and then patterned with a photomask layer (240) to define the floating polygate region (260). Subsequently, polysilicon layer (220) is oxidized whereby gate bird's beaks (221) and (221') are formed as well known in the art. It is proposed in this invention a method of implanting the polysilicon so as to decrease the growth of the protrusion of gate bird's beak as shown by reference numerals (221) and (221') in FIG. 1c to a smaller size and sharper shape shown by reference numerals (225) and (225'). It will be known by those skilled in the art that the smaller the birds' beak, the smaller is the encroachment under the polysilicon edge, and hence the smaller is the impact on the electric-field intensity between the corner edge of the floating gate (229) and the control gate (280) of the completed cell structure shown in FIG. 1d, and thus faster is the memory speed. (See S. 
Wolf and R.N. Tauber, \"Silicon Processing for the VLSI Era,\" vol. 2, Lattice Press, Sunset Beach, Calif., 1990, p. 438)."} +{"output_text": " read out is determined on the basis of the density values of the read data. The image processing section may read out data which has been written in a memory in line units (for example, line unit compression). The relative position of the lines to be read out is determined on the basis of the density values of the read data.\nIn the image forming apparatus, the image data is compressed by the block unit compression or the line unit compression. The compressed data is stored in a memory. The image", "input_text": "., images for each color), which are formed on each of image carriers, on a recording medium.\n2. Description of the Related Art\nAn image forming apparatus has been proposed which has a plurality of recording units (for example, laser beam printers) in which each recording unit irradiates a laser beam modulated in accordance with record information onto a photosensitive drum, develops an electrostatic latent image on the photosensitive drum by an electrophotographic process, and transfers the image to a transfer paper. Each color image is transferred and superimposed during transport of the transfer paper through the recording units by means of a transfer belt, thereby making it possible to form a multi-color image.\nIn this type of image forming apparatus, if there are mechanical mounting errors between the photosensitive drums, optical path length errors between the light beams, or changes in the optical path between the light beams, the images for each color formed by forming electrostatic latent images on the various photosensitive drums, and then developing and transferring them on the recording paper on the transfer belt, will not be registered correctly. 
For this reason, a pattern image for registration correction is read by an image sensor such as a CCD sensor, and the pattern image is transferred onto a transfer belt from the various photosensitive drums. The position of the registration correction pattern for each color is determined on the basis of the density values of the read data. Registration deviations on the photosensitive drum respectively corresponding to each of the colors are detected on the basis of the thus determined position. Image signals to be recorded are subjected to electrical corrections in accordance with the detected deviations, and/or a reflection mirror disposed in the optical path of the light beams is driven to correct changes in the optical path length or the optical path.\nThe image processing section of the image forming apparatus may read out data which has been written in a memory in block units (for example, block unit compression). The relative position of the blocks to be"} +{"output_text": " comfort and well-being would be realized if a warming device could be conveniently positioned with respect to a patient during thoracic surgery so that it could be quickly accessed and deployed during thoracic surgery with little or no time spent in retrieval.\nA warming device combining a clinical garment with a convective insert to provide comfort warming does not provide for therapeutic warming during abdominal surgery. Thus, even for a patient wearing a clinical garment with a convective apparatus as disclosed in WO 2003/086500, an upper body", "input_text": " No. 6,524,332, \u201cSystem and Method for Warming a Person to Prevent or Treat Hypothermia\u201d, commonly owned with this application.\nThe upper body thermal blanket 15 shown in FIGS. 1C and 1D is frequently used during thoracic, abdominal and pelvic surgery and/or in the post anesthesia care unit (PACU) to satisfy the need for therapeutic warming. 
As is known, a patient's core body temperature can drop to hypothermic levels quickly during surgery. To prevent or mitigate the effects of hypothermia, an upper body blanket may be deployed for therapeutic warming during the intraoperative period. However, the need for therapeutic warming often is ascertained only after surgery commences and it is inconvenient, and sometimes it is not possible, to interrupt attendance on a patient during a surgical procedure in order to locate and deploy a thermal blanket and bring it into operation. In such cases, therapeutic warming can be delayed until the patient enters the PACU, when the patient may have been hypothermic for a significant period of time. Given the frequency with which upper body thermal blankets are used during and after surgery, it would be very useful and clinically beneficial to conveniently position an upper body convective device with respect to a patient so that it could be quickly accessed and deployed during thoracic, abdominal, or pelvic surgery with little or no time spent in retrieval.\nA warming device combining a clinical garment with a convective insert to provide comfort warming does not provide for therapeutic warming during thoracic surgery. Thus, even for a patient wearing a clinical garment with a convective apparatus as disclosed in WO 2003/086500, an upper body thermal blanket must be unpackaged, made ready and deployed during such surgery. Warming may be indicated postoperatively in order to stave off hypothermia while the patient's recuperation proceeds. 
Manifestly, a substantial convenience and a significant gain in a patient's"} +{"output_text": " main back wired board fixedly mounted to a back of the shelf unit;\na sub back wired board positioned further to the rear of the main back wired board;\na plurality of connectors for plug-in units mounted on an inner surface of the main back wired board so as to be connected to the plug-in units inserted into the shelf unit;\na plurality of connectors for external cables mounted on an outer side of the sub back wired board so as to be connected to external cables;\n", "input_text": "-in units inserted into the shelf unit and disposed side by side;\na main back wired board fixedly mounted to a back of the shelf unit;\na sub back wired board positioned further to the rear of the main back wired board;\na plurality of connectors for plug-in units mounted on an inner surface of the main back wired board so as to be connected to the plug-in units inserted into the shelf unit;\na plurality of connectors for external cables mounted on an outer side of the sub back wired board so as to be connected to external cables;\na plurality of relay connectors mounted on a surface of the main back wired board, the surface opposing the sub back wired board, and electrically connected to the plurality of connectors for plug-in units; and\na plurality of relay connectors mounted on surface of the sub back wired board, the surface opposing the main back wired board, the plurality of relay connectors being electrically connected to the plurality of connectors for external cables,\nwherein the plurality of relay connectors on the main back wired board and the corresponding plurality of relay connectors on the sub back wired board are fitted together, and the main back wired board and the sub back wired board are electrically connected to each other.\nBy providing a telecommunications device in which relay connectors between an opposed main back wired board and sub back wired board are 
connected, and thus electrically connecting the main back wired board and the sub back wired board, the need for space within which to connect the terminals of the conventional coaxial cable between the main back wired board and the sub back wired board is eliminated. By eliminating the need for such additional space there is little need for the main back wired board and the sub back wired board to increase in size even with an increase in device functions.\nFurther, the above-described object of the present invention is achieved by providing a telecommunications device comprising:\na shelf unit;\na"} +{"output_text": " are then compressed into tablets.\nU.S. Pat. No. 5,266,581, issued Nov. 30, 1993, to Schmidt et al, describes a pharmaceutical composition containing nifedipine and PVP. The composition is prepared by dissolving nifedipine and PVP in a suitable organic solvent, and adding insoluble PVPP to the mixture. The mixture is then granulated. The granules are then compressed into tablets.\nU.S. Pat.", "input_text": " Mar. 21, 1989, to Tack et al, describes a combination pharmaceutical containing nifedipine and mepindolol. The nifedipine and mepindolol are granulated separately using conventional excipients via a wet or dry granulation process. The separate granules are then placed within hard gelatin capsules for oral consumption.\nU.S. Pat. No. 4,882,144, issued Nov. 21, 1989; and the resulting Reissued Patent, U.S. Pat. No. 33,963, issued Jun. 16, 1992 to Hegasy; as well as U.S. Pat. No. 5,264,446, issued 23, 1993, to Hegasy et al, describe a solid pharmaceutical composition containing dihydropyridines, and processes for their production. Specifically, Hegasy describes the dissolution of nifedipine and polyvinylpyrrolidone (PVP) in a limited amount of organic solvent to form a viscous slush containing nifedipine and PVP. The slush is then mixed with a cross-linked, insoluble polyvinylpyrrolidone (PVPP) to cause agglomeration. 
This agglomerated mixture is then compressed into tablets in which the nifedipine (or other dihydropyridine active agent) is uniformly spread throughout the tablet.\nAnother pharmaceutical composition containing dihydropyridines, PVP and insoluble PVPP is described by Schmidt et al in U.S. Pat. No. 5,266,581, issued Nov. 30, 1993. Here, the dihydropyridine and the PVP are dissolved in a suitable organic solvent, and a wetting agent is added thereto. Insoluble PVPP is then added to the mixture, and the mixture is granulated. The granules"} +{"output_text": " sensor output signal.\nRandom noise in the sensor output signal is caused by a number of factors. For example, random noise can be caused by thermal noise in the sensor output signal, by random noise in the sensor scanning mechanism, and by random noise in the sensor output signal processing circuitry.\nThe present invention is directed to a method and apparatus for correcting sensor output signal distortion caused by random noise in a sensor scanning mechanism.\nIn accordance with the present invention, a method and apparatus for correcting sensor", "input_text": " invention, the first ink can be replaced with the second ink contained in the second ink container, depending on the situation to perform the recording or after the first ink is exhausted. During this process, it is unnecessary to wash the ink flow passage in the head. The first ink and the second ink may have different colors. Alternatively, the first ink and the second ink may have an identical color. The invention relates to sensor systems and, more particularly, to non-imaging, scanning sensor systems which simultaneously correct sensor output signal distortion caused by random sensor noise and temporal instabilities in the sensor scanning mechanism.\nNon-imaging, scanning sensor systems are employed in many applications where it is desired to detect the presence of objects. 
For example, non-imaging scanning sensor systems employing an array of infrared detector elements positioned in a focal plane of a scanning optical system are used to passively detect the presence of objects at extended distances. The array is typically mounted on a gimballed sensor unit to scan a portion of a field of view and produce sensor output signals which are sampled and multiplexed for further processing by on-gimbal and off-gimbal circuitry.\nSensor output signals from the gimballed sensor unit can be degraded by distortion from a number of sources. Two of the largest and most common sources of sensor output signal distortion are timing, or temporal, instabilities in the sensor scanning mechanism, and random noise inherent in the sensor.\nTemporal instabilities can arise in the output signal of scanning system sensor units from environmental stresses which induce undesired spatial displacement in the sensor scanning mechanism. Uncompensated spatial displacements can interrupt the linear scanning process of the scanning mechanism in the sensor unit. These discontinuities in the otherwise linear scanning process then cause temporal instabilities in the sensor output signal. Similarly, timing errors in the sampling circuitry of the sensor unit cause temporal instabilities in the"} +{"output_text": " variable-length codes received from the coder 66 to generate a set of quantized coefficients. The inverse DCT 72 then transforms the quantized coefficients into a set of pixel values that are similar to the pixel values of the pre-compression I frame. The inverse DCT 72 is designed to mimic the inverse DCT of the decoder (FIG. 6).\nThe encoder 50 then encodes the pixel values of the reference frame using a technique that is similar or identical to the encoding technique used by", "input_text": " Huffman codes. These codes form the encoded data that represent the pixel values of the encoded I frame. 
A transmit buffer 68 then temporarily stores these codes to allow synchronized transmission of the encoded data to a decoder (discussed below in conjunction with FIG. 6). Alternatively, if the encoded data is to be stored instead of transmitted, the coder 66 may provide the variable-length codes directly to a storage medium such as a CD-ROM.\nIf the I frame will be used as a reference (as it often will be) for one or more non-I frames in the GOP, then, for the following reasons, the encoder 50 generates a corresponding reference frame by decoding the encoded I frame with a decoding technique that is similar or identical to the decoding technique used by the decoder (FIG. 6). When decoding non-I frames that are referenced to the I frame, the decoder has no option but to use the decoded I frame as a reference frame. Because MPEG encoding and decoding are lossy (some information is lost due to quantization of the AC and DC transform coefficients), the pixel values of the decoded I frame will often be different than the pre-compression pixel values of the I frame. Therefore, using the pre-compression I frame as a reference frame during encoding may cause additional artifacts in the decoded non-I frame because the reference frame used for decoding (decoded I frame) would be different than the reference frame used for encoding (pre-compression I frame).\nTherefore, to generate a reference frame for the encoder that will be similar to or the same as the reference frame for the decoder, the encoder 50 includes a dequantizer 70 and an inverse DCT 72, which are designed to mimic the dequantizer and inverse DCT of the decoder (FIG. 6). 
The dequantizer 70 dequantizes the"} +{"output_text": ", an arylthio group, an acyl group, an acyloxy group, an acylamino group, an alkoxycarbonyl group, an aryloxycarbonyl group, an alkylaminocarbonyl group, an arylaminocarbonyl group, a sulfonyl group, a sulfinyl group, a sulfamoyl group, a carbamoyl group, a sulfonylamino group, a sulfinylamino group, a ureido group, a thioureido group, a sulfonylureido", "input_text": " of W2 and Z1 include following rings (in which each of Ra and Rb independently represents hydrogen atom or a substituent group): \nAmong the above rings, A-1 and A-2 are preferred.\nExamples of the heterocyclic rings consisting of the set of W1 and Y1 or the set of W2 and Z1 include following rings (in which each of Ra, Rb and Rc independently represents hydrogen atom or a substituent group): \nAmong the above heterocyclic rings, A-5, A-6 and A-7 are preferred. Each of Ra, Rb and Rc has the same meaning as that described hereinbefore for A1, A2, B1 and B2. Ra, Rb and Rc may be combined to from a saturated or unsaturated carbon ring (e.g., cyclohexyl ring, cyclopentyl ring, cyclohexene ring, and benzene ring), or a saturated or unsaturated heterocyclic ring (e.g., piperidine ring, piperazine ring, morpholino ring, tetrahydrofuran ring, furan ring, thiophene ring, pyridine ring, and pyrazine ring). In that case, the ring may have one or more substituent groups. Examples of the substituent groups are the same as those described hereinbefore for A1, A2, B1 and B2.\nEach of L1, L2, L3, L4 and L5 independently represents a methine group which may have one or more substituent groups. Examples of the substituent groups are the same as those described hereinbefore for A1, A2, B1 and B2. 
Among them, preferred substituent groups are an alkyl group, an aralkyl group, an aryl group, alkoxy group, an aryloxy group, an alkylthio group"} +{"output_text": "regation of antibodies is undesirable because it reduces the amount of radiolabeled antibody that can be administered to a patient, and also because it reduces the amount of radiolabeled antibody that remains in the patient after administration.\nIn order to overcome the problems associated with the use of radiolabeled antibodies, it has been proposed to use antibody fragments, such as F(ab\u2032)2, Fab, Fab\u2032, and Fv fragments, which retain the antigen binding specificity of the intact antibody", "input_text": " completely lack constant domain sequences but which bind antigen. (See, Bird et al., Science, 242, 423-426 (1988)).\nIn order to increase the efficacy of antibody molecules as diagnostic or therapeutic agents, it is conventional to covalently bind or complex desired molecules thereto, in particular effector or reporter molecules. Effector molecules essentially comprise molecules having a desired activity, e.g., cytotoxic activity. By contrast, a reporter molecule is defined as any moiety which may be detected using an assay. Examples of effector molecules which have been attached to antibodies include by way of example, toxins, anti-tumor agents, therapeutic enzymes, radionuclides, antiviral agents, chelating agents, cytokines, growth factors, and polynucleotides. 
Examples of reporter molecules which have been conjugated to antibodies include, by way of example, enzymes, radiolabels, fluorescent labels, phosphorescent molecules, chemiluminescent molecules, chromophores, luminescent molecules, and colored particles.\nWhile it is desirable to attach molecules to antibodies in order to impart a desired activity to the antibody or provide for the detection thereof, the attachment of desired molecules to antibodies is not always possible to carry out conveniently, or effectively, because such attachment may result in loss of antibody activity. In particular, current methods for generating radiolabeled antibodies for diagnostic and therapeutic use suffer from such limitations. For example, the ratio of target-specific versus non-specific uptake of radiolabeled antibodies used in tumor imaging is often low, resulting in unclear images or missing tumor sites. Moreover, the low therapeutic index of radiolabeled antibodies limits the use of high radiation doses in radiation therapy.\nThe underlying reason for such problems is largely because the labeling chemistry for introduction of the radiolabel results in the partial denaturation of the antibody structure, which in turn causes the antibodies to aggregate in vivo or in vitro. Agg"} +{"output_text": " used for etching or chemical vapor deposition (CVD) of a film on the wafer.\nThe wafer support is typically mounted to a pedestal by a plurality of bolts. The pedestal is typically fabricated from a metal such as aluminum. The pedestal may also be fabricated from a ceramic material such as aluminum oxide or aluminum nitride. The pedestal may also include various components which provide heating and/or cooling of the wafer. The pedestal may also include elements for clamping (chucking) a", "input_text": "istic Line Emission. This photon-induced process of x-ray emission is called X-ray Fluorescence, or XRF. FIG. 
6B shows schematically X-ray fluorescence from the K shell and a typical x-ray fluorescence spectrum from a sample of aluminum is shown in FIG. 8. The spectrum is measured with a solid state, photon counting detector whose energy resolution dominates the natural line width of the L-K transition. It is important to note that these monoenergetic emission lines do not sit on top of a background of broad band continuous radiation; rather, the spectrum is Bremsstrahlung free. 1. Field of the Invention\nThe invention relates to semiconductor wafer processing equipment and, more particularly, the invention relates to electrostatic substrate supports having an RF bias electrode.\n2. Description of the Background Art\nA semiconductor wafer processing system for manufacture of integrated circuits (IC's) generally includes a vacuum chamber within which is mounted a wafer support during processing. The wafer support typically comprises a susceptor mounted to a pedestal. The pedestal is typically fabricated from a metal such as aluminum. The susceptor may be fabricated from laminated sheets of a polymer. However, for high temperature applications, the susceptor is typically fabricated from a ceramic material such as aluminum oxide or aluminum nitride. The susceptor typically contains various components which provide heating and/or cooling of the wafer. The susceptor may also include elements for clamping (chucking) a wafer to retain it in a stationary position upon the susceptor surface. Such clamping is provided by either a mechanical clamp or an electrostatic chuck. The susceptor may also include one or more electrodes for applying a bias voltage to the wafer. Such a bias voltage may be a direct current (DC) bias or a radio frequency (RF) bias. 
An RF bias may be used, for example, to supply or enhance power to a plasma"} +{"output_text": "ant gases, removal of reaction products, and management of water.\nA GDL is generally made of a carbon material such as carbon paper, carbon cloth, or carbon felt. However, a carbon material has a low strength and a low thermal conductivity, and thus it is difficult to apply the carbon material to a fuel cell stack. Accordingly, a GDL is generally made of a metal material such as stainless steel, nickel, or aluminum. However, a metal material has a high strength and a", "input_text": " generates from the reaction between hydrogen and oxygen in the air in the polymer electrolyte membrane fuel cell. If the freeze-thaw cycle is repetitively changed from a sub-zero temperature to an ordinary temperature, components of the fuel cell and interfaces between the components such as an MEA and a gas diffusion layer (GDL) may be physically damaged thereby reducing its electrochemical performance and durability. Therefore, for the stable operation of a hydrogen fuel cell vehicle, it is crucial to increase the durability of a fuel cell stack under such a freeze-thaw cycle condition.\nVarious attempts have been conducted to increase the freeze-thaw durability of a typical fuel cell. For example, Korean Pat. No. 10-0802749, registered in 2008, discloses a technology of increasing the durability by optimizing a fuel cell cooling line structure to reduce the freeze-thaw cycle. U.S. Pat. Application Publication Nos. 2010/0143813 and 2008/0102326 disclose technologies of increasing freeze start capability by optimizing a method for controlling operation of a fuel cell. Also, U.S. Pat. Application Publication No. 2008/0241608 discloses a method of operating a fuel cell by removing ice generated at a sub-zero temperature by heat. However, these methods are too complex to apply in reality, and their effects are also limited. 
Accordingly, for a mass production of hydrogen fuel cell vehicles, it is necessary to develop a new technology to improve the freeze-thaw durability while at the same time making the implementation process as simple as possible.\nAs commercialization of fuel cells progresses, much research and development is being conducted on a gas diffusion layer (GDL) that is an essential component for managing water in a fuel cell. A GDL is attached to the outer surface of anode and cathode catalyst layers in an MEA of a fuel cell to perform various functions such as supply of react"} +{"output_text": " separated from the broth by centrifugation. The biomass is then washed with water and dried. The biomass is then subjected to a hydrolysis step in which the biomass is treated with an aqueous solution of an acid, preferably hydrochloric acid, at a temperature of between 20 and 60\u00b0 C. for a period of between 1 and 24 hours. The hydrolysis step is preferably carried out in a stirred tank reactor. The hydrolysis step is preferably carried out in the presence of a buffer, for example sodium acetate", "input_text": " culture. Foaming may be controlled by using antifoaming agents such as fatty acid polyglycol esters for example. Plasmid stability may be maintained by the addition to the medium of suitable selectively acting substances, for example antibiotics. The introduction of oxygen or oxygen-containing gas mixtures such as air into the culture and thorough mixing using suitable stirring systems or the gas stream may be used to maintain aerobic conditions. The temperature of the culture is typically 25\u00b0 C. to 37\u00b0 C. The culture is continued until the maximum quantity of L-lysine has formed. This aim is normally achieved within 10 to 160 hours.\nExamples of suitable fermentation media may be found, for example, in patent specifications EP-B-0 532 867, U.S. Pat. No. 5,840,551 and U.S. Pat. No. 
5,990,350.\nAnalysis of L-lysine may be performed by anion exchange chromatography with subsequent ninhydrin derivatisation, as described in Spackman et al. (Analytical Chemistry, 30, (1958), 1190) or it may be performed by reversed phase HPLC, as described in Lindroth et al. (Analytical Chemistry (1979) 51: 1167-1174).\nThe fermentation broths used for the process according to the invention preferably have an L-lysine content greater than 60 g/L (as lysine base) for a content of non-metabolised sugar of less than 5.0 g/L. Out of a total solids content of greater than 10 wt. %, the biomass preferably accounts for 1 to 4 wt. %. The content of by-products and vitamins from fermentation (amino acids, organic acids) is preferably less than 2 wt. %.\nIn the process according to the invention, the biomass present in the fermentation broth is"} +{"output_text": "C8alkoxy,\nR7 and R8 are independently C1-C8alkyl or C1-C8alkoxy,\nR13 is hydrogen, C1-C8alkyl, C1-C8alkoxy, C1-C8alkoxy-C2-C8alkylene, C1-C8alkoxy-C2-C8alkyleneoxy, C5-C6cycloalkyl, C5-C6cycloalkoxy", "input_text": " to 8 the B groups may be identical or different,\nE1 is oxygen or is selected from the group consisting of methylene, methyleneoxy and ethylene, each member of the group being unsubstituted or substituted by one R5 or by 2 radicals, R5 and R6, or is two separate radicals, R7 and R8, R7 being attached to the same atom as R1 and R8 to the same atom as R4,\nE2 is selected from the group consisting of methylene, ethylene, propylene and butylene, each member of the group being unsubstituted or substituted by one R9 or by 2 radicals, R9 and R10, or is two separate radicals, R11, and R12, R11 being attached to the same atom as R1 and R12 to the same atom as R4,\nG1 is O or N(R13),\nR1 is hydrogen, methyl, ethyl, methoxy or ethoxy,\nR2 and R3 are independently hydrogen, C1-C8alkyl, C1-C8alkoxy, C1-C8alkoxy-C2-C8alkylene or C1-C8alkoxy-C2-C8alkyleneoxy,\nR4 is hydrogen, C1-C8alkyl, C1-C8alkoxy, 
C1-C8alkoxy-C2-C8alkylene, C1-C8alkoxy-C2-C8alkyleneoxy, C5-C6cycloalkyl, C5-C6cycloalkoxy, phenyl, phenoxy or a 5- or 6-membered, saturated or singly to triply unsaturated heterocyclic radical,\nR5, R6, R9, R10 and R12 are independently C1-C8alkyl or C1-"} +{"output_text": " urea byproducts are not present.\nThus, there is a need for a method and system for reducing NOx emissions from an engine that addresses one or more of the problems associated with the conventional art. Furthermore, other desirable features and characteristics of the present invention will become apparent from the subsequent detailed description of the invention and the appended claims, taken in conjunction with the accompanying drawings and this background of the invention.", "input_text": "i.e., overdosing), which results in an apparent failure of the SCR system. Overdosing is rendered even more likely because commercially available NOx sensors can cross-react with NH3 at the tailpipe to report it as NOx, thereby providing a falsely high NOx reading. Thus, two contributing factors can cause apparent SCR failure: (1) unaccounted NH3 from HNCO hydrolysis due to urea byproduct decomposition; and (2) the NOx sensor's inability to distinguish between NH3 and NOx. Each of these two factors, alone or in combination, can lead an engine management system to increase the supply of urea, resulting in overdosing.\nIn addition to HNCO from incomplete urea decomposition, tar-like compounds are produced from urea byproducts at around 400\u00b0 C. and above in an engine aftertreatment system (e.g., during a DPF regeneration procedure at up to about 600\u00b0 C.). These highly undesirable materials can accumulate in the SCR and contribute to premature catalyst aging. Furthermore, during decomposition of pure urea and urea byproducts, very fine particulate matter are released and transported downstream by exhaust flow. 
The particulate matter can contribute to catalyst fouling/blinding, overdosing of NH3, detection of SCR failure, and/or premature catalyst aging.\nFurthermore, one important and persistent consequence of incomplete decomposition of urea is the occurrence of side reactions that form high molecular weight solid deposits, which in turn can have deleterious effects for SCR operation, engine performance, fuel efficiency, and impact system configuration and vehicle design. Deposit formation can vary with engine aftertreatment configuration, and can limit the degrees of freedom available for aftertreatment system and vehicle designers. While urea byproducts can decompose at high temperatures, efforts to decrease solid deposits can fail because at the high temperature of an engine exhaust, conditions that favor decomposition of"} +{"output_text": " of boron, then impregnating the precursor with this solution, drying the precursor, then calcining the precursor.\nWhen the catalyst contains silicon, one preferred method of the invention consists of preparing an aqueous solution of silicon, then impregnating the precursor with this solution, drying the precursor, then calcining the precursor.\nWhen the catalyst contains phosphorous, one preferred method of the invention consists of preparing an aqueous solution of phosphorous, then impregnating the precursor with this solution, drying the precursor", "input_text": " at the same time as the latter, if the catalysts contain at least one group VIB metal and at least one group VIII metal.\nWhen the catalyst contains at least one group VIB element, for example molybdenum, it is possible, for example, to impregnate the catalyst with a solution containing at least one group VIB element, dry then calcine. 
Molybdenum impregnation can be facilitated by adding phosphoric acid to solutions of ammonium paramolybdate, which thus also introduces the phosphorous function to promote the catalytic activity.\nIn a preferred implementation of the invention, the catalyst contains, as a promoter, at least one element selected from silicon, boron and phosphorous. These elements are introduced into a support already containing at least one IM-5 zeolite, at least one matrix as defined above, and preferably also containing at least one metal selected from the group formed by group VIB and group VIII metals.\nWhen the catalyst contains boron and/or silicon and/or phosphorous and optionally an element selected from group VIIA, halogen ions, optionally at least one element selected from group VIIB and optionally at least one element selected from group VB, these elements can also be introduced into the catalyst at various stages of the preparation and in various manners.\nThe matrix is preferably impregnated using the \u201cdry\u201d impregnation method which is well known to the skilled person. 
Impregnation can be carried out in a single step using a solution containing all of the constituent elements of the final catalyst.\nThe P, B, Si and the element selected from group VIIA halide ions can be introduced into the calcined precursor by one or more impregnation operations using an excess of solution.\nWhen the catalyst contains boron, one preferred method of the invention consists of preparing an aqueous solution"} +{"output_text": " the other hand, the MOS transistor is turned on by a signal from the vertical scanning circuit (VSC) and the signal charges Qsig are transferred to the floating diffusion region (FD) via the MOS transistor.\nThe signal charges Qsig are amplified by the source follower circuit and are outputted as a voltage signal.\nThe MOS transistor is turned off by a signal from the horizontal scanning circuit (HSC) and the signal charges Qsig are transferred to the floating diffusion region (FD)", "input_text": " a photodiode and a CCD shift register, and a device called APS (Active Pixel Sensor), comprising a photodiode and a MOS transistor.\nThe APS includes a photodiode, a MOS switch, an amplification circuit for amplifying a signal from a photodiode and the like in each pixel and has many merits that the \u201cXY addressing\u201d, \u201cmaking the sensor and the signal processing circuit into a single chip\u201d or the like is achievable. In recent years, attentions have been attracted to APS owing to a promoted miniaturizing technique of MOS transistors and a raised demand for \u201cmaking the sensor and the signal processing circuit into a single chip\u201d or \u201creducing the consumption power\u201d.\nFIG. 14 shows the pixel part of a conventional ASP and an equivalent circuit of a solid image pickup device using it. These were reported by Mr. Eric R. Fossum et al. at a work shop of IEEE in 1995. 
The configuration of the prior art will be briefly described below.\nThe photoelectric conversion part is an embedded-type photodiode (PPD) used in CCD or the like. By providing a concentrated p layer on the surface, the embedded-type photodiode can suppress the dark current occurring at its interface with SiO2 on it and can provide a junction capacity also between the n layer of the accumulation part and the p layer on the surface of it, thereby increasing the saturated charge quantity of the photodiode.\nThe photo-signal charges accumulated in the photoelectric conversion part is read out via the charge transfer means (TX) comprising a MOS transistor to the floating diffusion region (FD).\nThe signal charges Qsig are voltage-converted into Qsig/CFD by the capacity of this floating diffusion region (CFD) and the signals are read out through a source follower circuit not shown in FIG. 14.\nOn"} +{"output_text": " be driver incompatibilities (e.g., if a new game is downloaded, it may install a new version of a graphics driver that renders a previously-installed game, reliant upon an old version of the graphics driver, inoperable). A console may run out of local disk space as more games are downloaded. Complex games typically receive downloaded patches over time from the game developer as bugs are found and fixed, or if modifications are made to the game (e.g., if the game", "input_text": ", to the extent games or MMOGs require an online connection for the game to be playable, the piracy problem is mitigated since the user is usually required to have a valid user account. Unlike linear media (e.g., video and music) which can be copied by a camera shooting video of the display screen or a microphone recording audio from the speakers, each video game experience is unique, and can not be copied using simple video/audio recording. 
Thus, even in regions where copyright laws are not strongly enforced and piracy is rampant, MMOGs can be shielded from piracy and therefore a business can be supported. For example, Vivendi SA's \u201cWorld of Warcraft\u201d MMOG has been successfully deployed without suffering from piracy throughout the world. And many online or MMOG games, such as Linden Lab's \u201cSecond Life\u201d MMOG generate revenue for the games' operators through economic models built into the games where assets can be bought, sold, and even created using online tools. Thus, mechanisms in addition to conventional game software purchases or subscriptions can be used to pay for the use of online games.\nWhile piracy can often be mitigated due to the nature of online or MMOGs, online game operators still face remaining challenges. Many games require substantial local (i.e., in-home) processing resources for online or MMOGs to work properly. If a user has a low performance local computer (e.g., one without a GPU, such as a low-end laptop), he may not be able to play the game. Additionally, as game consoles age, they fall further behind the state-of-the-art and may not be able to handle more advanced games. Even assuming the user's local PC is able to handle the computational requirements of a game, there are often installation complexities. 
There may"} +{"output_text": " based advertising, location based information services, location based gaming, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information retrieval, location based information provision, location based information", "input_text": " in the environment. Alternatively or additionally, the fingerprint can built up dynamically by receiving submissions of signal measurements experienced by the actual devices of actual users in an ongoing training phase.\nThe determination of the mobile device's location may be performed according to a \u201cdevice-centric\u201d approach or a \u201cnetwork-centric\u201d approach. According to a device centric approach, each anchor or reference node emits a respective beacon signal. The mobile device takes measurements of beacon signals it receives from the reference nodes, obtains the locations of those nodes from the location server, and performs the calculation to determine its own location at the mobile device itself. According to a network-centric approach on the other hand, the reference nodes are used to take measurements of beacon signals received from the mobile device, and an element of the network such as the location server performs the calculation to determine the mobile device's location. Hybrid approaches are also possible, e.g. 
where the mobile device takes the raw measurements but forwards them to the location server to calculate its location (also sometimes referred to as an \u201cassisted\u201d approach).\nThere are various reasons why it may be desirable to be able to detect the location of a wireless device, such as to provide location based services. For instance, one application of a positioning system is to automatically provide a wireless mobile device with access to control of a utility such as a lighting system, on condition that the mobile device is found to be located in a particular spatial region or zone associated with the lighting or other utility. E.g. access to control of the lighting in a room may be provided to a wireless user device on condition that the device is found to be located within that room and requests access. Once a wireless user device has been located and determined to be within a valid region, control access is provided to that device via a lighting control network. Other examples of location based services or functionality include indoor navigation, location"} +{"output_text": " the crisper drawers are a necessity for any refrigerator, they are often the most difficult to clean.\nThe typical crisper drawer is comprised of a metal frame and a plastic liner. The metal frame is typically comprised of a plurality of metal bars which are welded together to form a rectangular frame. The plastic liner is typically comprised of a plurality of plastic panels which are welded together to form a rectangular frame. The plastic liner is typically comprised of a plurality of plastic panels which are welded", "input_text": " one of the main drawbacks of an inflatable pillow is that the air moves in the opposite direction as a result of the pressure applied by someone's head, leaving an uneven pressure gradient throughout the inflatable pillow. Most of the inflatable air pillows are made from vinyl material that will cause the user to feel hot and sticky. 
These traditional pillows are not breathable and can result in the person breaking out with acne on the chest, neck, and chin. However, the advantage of the inflatable pillow is the ease of storage and its light weight. 1. Technical Field\nThis invention relates to absorbent liners and, more particularly, to disposable absorbent liners for absorbing moisture and assisting to maintain perishable foodstuff at a fresh state within a refrigerated environment.\n2. Prior Art\nFor most people, household chores are a fact of life. Vacuuming debris from carpeting, polishing furniture, waxing floors and cleaning windows are necessary tasks which must be done regularly to ensure a healthy and clean household. In particular, keeping a clean kitchen is of utmost concern for many consumers. Because the kitchen is such a vital component of any home, most consumers take careful steps to make sure this room is clean and sanitary. Specifically, cleaning away spills and food debris from the interior of one's refrigerator is an important chore regularly completed by most household consumers. Most refrigerator shelving is comprised of rubber coated wire racks, as well as solid plastic shelves.\nProviding ample space on which to store gallons of milk, condiments, and a host of other goods, shelving is integral to any refrigerator design. A crucial element of any refrigerator is the \u201ccrisper\u201d drawers. Typically utilized to store fresh produce including a wide variety of fruits and vegetables, crisper drawers also provide ample storage for items such as luncheon meats, cheese and eggs. Although"} +{"output_text": ", 1995, which is incorporated herein by reference.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a high-density and a high-integration.\nIn recent years, a semiconductor device having a high-density and a high-integration has been developed. 
In order to realize a high-density and a high-integration, it is necessary to reduce the size of a semiconductor device. However, it is difficult to", "input_text": " horizontal axis and/or rotated about the vertical axis wherein a subject to be tested faces the space from a position relative to front face and moves the images into convergence in the center of the cubic space by observing the first and second images through an open front wall of the cubic space wherein the front face aperture is reduced in size when the cubic space is rotated. Optical frequency modulation (OFM) is believed to be a preferred format for signals carried in free-space optical links. When compared with the conventional technique of amplitude modulation (AM) and direct detection, the signal-to-noise ratio achieved with OFM is better by the amount (fFM/\u0394f)2, where fFM is the Frequency Modulation (FM) index and \u0394f is the maximum modulation frequency (or the signal bandwidth). The benefit of OFM is especially important for applications in which the optical power incident on the receiver station is weak. This would typically be the case for a free-space optical link, which achieves power-efficient performance.\nTypically, OFM is achieved by directly modulating the current driving a semiconductor laser, which results in both amplitude modulation and frequency modulation of the output. See FIG. 1(a). Any residual amplitude modulation (RAM) of a frequency-modulated (FM) laser will degrade the signal-to-noise ratio of a communications link or degrade the sensitivity of a remote sensing apparatus. This is because that amplitude modulation is indistinguishable from added noise.\nTypical communication signals, radar signals and channelized wideband electronic warfare signals have a bandwidth as high as 1-2 GHz. Thus, it is desirable to have a large FM Index and a maximum frequency excursion that may be in excess of 10-20 GHz. 
One example of a state-of-the-art device is described in IEEE J. Sel. Topics Quantum Electronics V.1, pp.461-465"} +{"output_text": ", i.e., materials having permeabilities greater than 1.0. The term \"high Curie temperature\" refers to materials having Curie temperatures greater than 300.degree. C. The term \"high resistivity\" refers to materials having resistivities greater than 10.sup.9 ohm-cm. The term \"low permeability\" refers to materials having permeabilities less than 1.0. The term \"low Curie temperature\" refers to materials having Curie temperatures less than 300.degree", "input_text": " field in a simple, single layer, i.e., monolithic structure, falls off as e.sup.-x so that at three skin depths, the field is 4.9% of maximum, at five skin depths, it is 0.67%, and at ten skin depths, the field is 0.005% of maximum. For some uses, thicknesses of three skin depths are satisfactory although at least five are preferred in most cases, ten or more may be required with some highly sensitive devices in the vicinity of large heating currents.\nThe devices of the patent and aforesaid application are operative for their intended purposes when connected to a suitable supply, but a drawback is the cost of the high frequency power supply. Where only a very low field may be permitted to radiate from the device, the frequency of the source is preferably maintained quite high, for instance, in the megahertz region, to be able to employ copper or other non-magnetic material having reasonable thicknesses.\nIn accordance with the invention of co-pending application of John F. Krumme, Ser. No. 543,443, filed Oct. 25, 1983, a continuation in part of Ser. No. 430,317, filed Sept. 
30, 1982, now abandoned; both said applications being entitled \"Autoregulating Electrically Shielded Heater\", a relatively low frequency constant current source may be employed as a result of fabricating the normally non-magnetic, low resistivity layer from a high permeability, high Curie temperature material. Thus, the device comprises a high permeability, high resistivity first layer adjacent the current return path and a high permeability, preferably low resistivity second layer remote from the return path; the second layer having a higher Curie temperature than the first-mentioned layer.\nAs used herein, the term \"high magnetic permeability\" refers to materials having permeabilities greater than para-magnetic materials"} +{"output_text": " in the television receiver.\nFurther, according to the color cathode-ray tube relating to the present invention, the effective surface, or the effective surface and the non-aperture portion of the mask main body are formed in a curved surface having a radius of curvature in the short axis direction, and at each long side of the mask main body, at least one portion at an intermediate part in the long axis direction of the effective surface or the non-aperture portion is recessed from", "input_text": "ure portion of the mask main body are formed in a curved surface having a radius of curvature in the short axis direction, and at each long side of the mask main body, at least one portion at an intermediate part in the long axis direction of the effective surface or the non-aperture portion is recessed from any other adjacent part, in a direction to leave from the phosphor screen along the tubular axis.\nFurther, according to the color cathode-ray tube relating to the present invention, the effective surface, or the effective surface and the non-aperture portion of the mask main body are formed in a curved surface having a radius of curvature in the short axis direction, and at each long side of the mask main body, at least a 
part of the effective surface and the non-aperture portion at the vicinity of the short axis is recessed from any other part adjacent in the long axis direction, in a direction to leave from the phosphor screen along the tubular axis.\nAccording to the color cathode-ray tube having the above-described structure, the provision of a recess on each of the long sides of the effective surface or the non-aperture portion of the mask main body makes it possible to maintain high-level strength for holding the curved surface of the effective surface of the mask main body even in the case where the radius of curvature of the effective surface of the mask main body has been increased along an increase in the radius of curvature of the external surface of the effective portion of the panel. Therefore, it is possible to provide a color cathode-ray tube which can minimize an occurrence of deterioration of color purity by restricting local deformation of a shadow mask in the process of manufacturing the shadow mask, thermal deformation of the shadow mask in the process of manufacturing the color cathode-ray tube, or oscillation due to the sound from the speaker when the color cathode-ray tube has been built"} +{"output_text": " a wire on the printed wiring board 5.\nThe conventional remote-controlled light receiving module 4 is mounted on the printed wiring board 5 in the following manner. First, the leads 3 of the remote-controlled light receiving module 4 are inserted into the lead inserting through holes 5 on the printed wiring board 5. Then, the solder 7 is applied to the leads 3 and the wires on the printed wiring board 5. The solder 7 is melted by heating the printed wiring board 5. As a result,", "input_text": " is shown in FIGS. 1A and 1B.\nReference numeral 1 shown in the figures is a package having an approximately cubic shape. The package 1 is made of resin which can be penetrated by the infrared light. 
A remote-controlled receiver IC which is not shown in the figure is sealed inside the package 1. Reference numeral 2 is a lens having a convex spherical surface created as an integrated part of the front surface of the package 1. The lens 2 is fixed at a location which is optimum for focusing the infrared light modulated by the remote control signal at the remote commander on the light receiving surface of the remote-controlled receiver IC.\nReference numeral 3 denotes a lead protruding from the bottom surface of the package 1 in a direction perpendicular to the bottom surface. In the conventional remote-controlled light receiving module 4, the shapes and the sizes of the sectional surfaces of the leads 3 are uniform at least at the bottom surface of the package 1 from which the leads 3 protrude. The leads 3 are created so as to extend straight out from the bottom surface of the package 1.\nSuch a remote-controlled light receiving module 4 is used in remote-controlled equipment such as a television receiver, tape recorder and air conditioner. Speaking in concrete terms, it is necessary to insert the external tips of the leads 3 of the remote-controlled light receiving module 4 into predetermined lead inserting through holes on a printed wiring board and solder the leads to wires on the board.\nFIG. 2 shows how the conventional remote-controlled light receiving module 4 is typically mounted on a printed wiring board. Reference numeral 5 shown in the figure is the printed wiring board whereas reference numerals 6 each denote a lead inserting through hole created on the printed wiring board 5 for inserting a lead 3. Reference numeral 7 denotes a piece of solder for connecting a lead 3 inserted into a lead inserting through hole 5 to"} +{"output_text": ") to be provided. 
The user profile generator also provides a user profile set-up menu user interface which allows a user to establish a user profile for a particular location of interest, e.g., home, vacation home, or work, for which the user would like a personalized storm warning to be provided, and a contact address, e.g., e-mail address or pager number, to which the personalized storm warning is to be delivered. The user profile set-up menu user interface", "input_text": " The main computer system receives weather information from one or more weather information sources, e.g., NEXRAD weather radar information provided by the government, local live weather radar information, and other weather information from local and/or remote sensors. NEXRAD weather radar information includes detailed storm attribute information describing the characteristics of storm cells. The NEXRAD storm attribute information also includes information on the direction and speed of movement of storm cells, from which a predicted track of these storms may be generated. The main computer system includes software for generating a predicted storm track from such NEXRAD data, or, more preferably, from NEXRAD data in combination with local live radar information. The local live radar information, which is less detailed, but which provides updated storm cell positions much more often than NEXRAD information, may be used in combination with NEXRAD information to enhance the accuracy of the predicted storm cell tracks.\nThe main computer system preferably also includes a user profile generator. The user profile generator provides various user profile set-up menu user interfaces which allow a user to establish a user profile. These menus may be accessed by a user by use of, for example, a personal computer connected to the main system computer over a network such as the internet. 
Using such menus, the user establishes a personal user profile which includes a particular location of interest, e.g., home, vacation home, or work, for which the user would like a personalized storm warning to be provided, and a contact address, e.g., e-mail address or pager number, to which the personalized storm warning is to be delivered. The set-up menu user interface also allows a user to define a storm profile, including storm attribute conditions for which the user would like a personalized storm warning to be provided, and the amount of advanced warning (e.g., based on predicted storm cell arrival time at the user location"} +{"output_text": "., in the slave laser. U.S. Pat. No. 5,742,941, issued to Verdiel, et al. on Apr. 21, 1998, entitled RING CAVITY LASER DEVICE, relates to a ring cavity which uses a beam from a master laser to control or lock the operation of a slave laser located in the ring cavity. It uses a non-linear medium in the cavity to avoid the need of insulators, e.g", "input_text": "IRRORS, relates to the use of tilted spherical mirrors in an unstable resonator to achieve asymmetric magnification to get \u201csimultaneous confocality\u201d and obviate the need for non-spherical mirrors. U.S. Pat. No. 4,247,831 issued to Lindop on Jan. 27, 1981, entitled RING LASERS, relates to a resonant cavity with at least 1 parallel sided isotropic refracting devices, e.g., prisms, with parallel sides at an oblique angle to part of light path that intersects said sides, along with a means to apply oscillating translational motion to said refracting devices. U.S. Pat. No. 4,268,800, issued to Johnston et al. on May 19, 1981, entitled, VERTEX-MOUNTED TIPPING BREWSTER PLATE FOR A RING LASER, relates to a tipping Brewster plate to fine tune a ring laser located close to a flat rear mirror A acting as one of the reflecting optics of the ring laser cavity. U.S. Pat. No. 4,499,582, entitled RING LASER, issued to Karning et al. on Feb. 
5, 1980, relates to a ring laser system with a folded path pat two separate pairs of electrodes with a partially reflective input coupler at a given wavelength. U.S. Pat. No. 5,097,478, issued to Verdiel, et al. on Mar. 17, 1992, entitled RING CAVITY LASER DEVICE, relates to a ring cavity which uses a beam from a master laser to control or lock the operation of a slave laser located in the ring cavity. It uses a non-linear medium in the cavity to avoid the need of insulators, e.g., for stabilizing to suppress oscillations, e.g"} +{"output_text": ". As a result, the transistors must be designed to operate at lower voltages. However, as the supply voltages decrease, the transistors must be designed to withstand higher voltages.\nOne way to increase the current density of a transistor is to increase the carrier mobility of the transistor. For example, in a p-channel transistor, the carrier mobility can be increased by increasing the mobility of holes in the channel region. However, as the supply voltages decrease, the holes in the channel region become more difficult to", "input_text": " the second layer of semiconductor material 108 below the gate dielectric 118. The drain current ID can be controlled by applying a voltage to the gate electrode 122. In CMOS applications, the transistor 101 is typically used as a simple switch having an on mode and an off mode. When the transistor 101 is off, the drain current ID is substantially zero. When the transistor 101 is on, the transistor operates in saturation mode and the drain current ID flows between the drain region 112 and the source region 110. The magnitude of the drain current ID is approximated by the following formula:\n$I_D = \u03bc_n (C_{ox}/2)(W/L)(V_{gs} - V_{th})^2$. 
\nAs can be seen from the expression above, the drain current ID depends on many factors, including the carrier mobility (\u03bcn for n-channel devices, \u03bcp for p-channel devices), the gate oxide capacitance Cox, the ratio of the channel width W to the channel length L, the threshold voltage Vth of the transistor, and the gate to source voltage Vgs. Thus, a selected value for the drain current ID can be achieved by selecting particular values for Vgs, the width to length ratio W/L, the carrier mobility \u03bcn, and the gate oxide capacitance Cox.\nIn some applications, it is beneficial to have a relatively high current footprint, i.e., a high amount of current per surface area of a semiconductor substrate. However, as integrated circuit technology continues to scale downward, there are difficulties involved with maintaining a high current density while scaling down the dimensions of the transistors. For example, as the dimensions of the transistors continue to scale downward, the supply voltages available to the integrated circuit die typically decrease as well"} +{"output_text": " cycle. The voltage at HB node 122 is shown in FIG. 2b as a solid line, while the voltage at circuit node 150 is shown in FIG. 2b as a dashed line. The voltage at HB node 122 is shown in FIG. 2b as a solid line, while the voltage at circuit node 150 is shown in FIG. 2b as a dashed line. The voltage at HB node 122 is shown in FIG. 2b as a solid line, while the voltage at circuit node", "input_text": " and collector terminals, which are conduction terminals, and a base terminal as a control terminal.\nWhen MOSFET 112 is on and MOSFET 116 is off, HB node 122 is coupled to a positive voltage at circuit node 110 through MOSFET 112. When MOSFET 116 is on and MOSFET 112 is off, HB node 122 is coupled to ground node 108 through MOSFET 116. 
The switching of MOSFETs 112 and 116 causes the voltage potential at HB node 122 to alternate between the voltage potential of voltage source 106 and ground potential. The pulsating voltage potential at HB node 122 causes resonant inductor 128, primary winding 132, magnetizing inductance 134, and resonant capacitor 136 to resonate.\nMagnetizing inductance 134 is not an actual physical inductor, but is used in analysis to represent a portion of current through transformer 130 that is used to magnetize core 137. Energy is transferred from primary winding 132 to secondary winding 138 through magnetic coupling. A certain percentage of the power input to transformer 130, analyzed as the current through magnetizing inductance 134, is lost in core 137 because the core does not have a perfectly efficient magnetic response.\nAs HB node 122 toggles between ground voltage and the voltage potential of voltage source 106, power is transferred from primary winding 132 to secondary winding 138. A circuit node 152 is connected to secondary winding 138 as a center-tap. A secondary winding portion 138a is coupled between center tapped circuit node 152 and diode 142, while secondary winding portion 138b is coupled between center tapped circuit node 152 and diode 144. Diodes 142 and 144 rectify the current through secondary winding 138. Capacitor 146 is coupled between circuit node 150 and circuit node 152 to filter the voltage to a more steady DC voltage.\nFIG. 2b illustrates timing diagrams of voltages and currents at various circuit nodes of LLC resonant mode converter 100 through a full power transfer"} +{"output_text": ") or are too broad in their spectral lineshapes to be useful for imaging.\nThe use of other biologically interesting nuclides for MRI has been limited by the lack of suitable NMR imaging techniques. The most widely used technique for imaging nuclides other than water protons is the use of nuclear Overhauser effect (NOE) spectroscopy. (References 10-11). 
In this technique, a radiofrequency (RF) pulse is applied to a sample, and the resulting magnetization is then perturbed", "input_text": " 5-6). In recent years, several groups have conducted.sup.19 F NMR studies which have shed light on the molecular environment of anesthetics in the brains of rabbits and rats. (References 3, 7). Using a surface coil placed on top of the calvarium during halothane inhalation, two overlapping spectral features observed by d'Avignon and coworkers, perhaps 0.1-0.2 ppm apart, could be resolved through their different transverse relaxation times (T.sub.2). (Reference 3). The biexponential dependence of the spin-echo amplitude on echo delay reported in this study demonstrated that anesthetics in different molecular environments could be discerned in the brain in vivo using.sup.19 F NMR. Such environments, separated by chemical shifts of only about 0.1 ppm, had previously been reported by Wyrwicz et al. in high resolution studies of excised neural tissue. (Reference 4).\nNotwithstanding such attempts to use other compounds for NMR imaging, state-of-the-art biological magnetic resonance imaging (MRI) has remained largely restricted to the water proton,.sup.1 H.sub.2 O, NMR signal. The natural abundance of water protons, about 80-100 M in tissue, and its large magnetic moment make it ideal for most imaging applications. Despite its tremendous value as a medical diagnostic tool, however, proton MRI does suffer several limitations. Most notably, the water protons in lung tissue, and the protons in lipids of all interesting biological membranes, are notoriously NMR invisible as a result of the short T.sub.2 in such environments. (References 8-9). Other.sup.1 H signals and signals from other biologically interesting nuclides are either present in too low a concentration (10.sup.-3 to 10.sup.-1 M, compared to ca. 100 M for H.sub.2 O"} +{"output_text": " 12. The LSU 14 scans the photosensitive medium 10 to form an electrostatic latent image. 
The developing roller 16 is disposed to face the photosensitive medium 10, and supplies toner to the photosensitive medium 10. The toner supplying roller 18 supplies toner to the developing roller 16. The toner layer regulation unit 20 regulates the thickness of a toner layer formed on the developing roller 16.\nThe toner supplying roller 18 is disposed to face the developing roller 16, and supplies toner to the developing roller 16. The toner supplying", "input_text": " developing agent.\nDry-type developing methods using the powder state of toner include a two-component developing method using a two-component toner in which carrier particles used to transport toner particles are contained, and a one-component developing method using only toner without a carrier. The one-component developing method includes a magnetic one-component developing method and a nonmagnetic one-component developing method. In the magnetic one-component developing method, a developing operation is performed using a toner for the magnetic one-component development. In the nonmagnetic one-component developing method, a toner layer is formed on a developing roller using a toner for the nonmagnetic one-component development and is developed either in contact with or not in contact with a photosensitive medium.\nIn the contact-type nonmagnetic one-component developing method, the price is very competitive. However, since it is difficult to attain dot reproducibility, line reproducibility, and high-resolution implementation, it is not easy to obtain a high quality image. Meanwhile, in the noncontact-type nonmagnetic one-component developing method, the structure of a developing unit is simple, and thus may be minimized. In addition, since attaining color reproducibility, edge reproducibility, high tone gradation, and high-resolution implementation is facilitated, a high quality image may be obtained.\nFIG. 
1 schematically illustrates a noncontact-type developing unit for a conventional electrophotographic image forming apparatus. Referring to FIG. 1, the conventional electrophotographic image forming apparatus includes a photosensitive medium 10, a charging roller 12, a laser scanning unit (LSU) 14, a developing roller 16, a toner supplying roller 18, and a toner layer regulation unit 20.\nThe photosensitive medium 10 has a structure in which a photosensitive film formed of a photosensitive material is formed on the circumference of a metallic drum. The surface of the photosensitive medium 10 is charged by the charging roller"} +{"output_text": " application program. The application program uses the extension to identify the file format. For example, the extension \u201c.doc\u201d is used to identify a word processing file. The extension \u201c.doc\u201d is added to the file name of the word processing file. The application program uses the extension to identify the file format.\nThe conventional voltage comparator is unable to recognize the electronic file format. Therefore, the conventional voltage comparator cannot open the electronic file. As a result, the conventional voltage comparator cannot associate the", "input_text": " Accordingly, the fifth transistor 106 is turned off.\nTherefore, the emitter of the sixth transistor 107 receives a power supply voltage VDD of the voltage comparator 100. Then, the sixth transistor 107 is turned on, so that a low-level signal is output to the output terminal of the voltage comparator 100, as shown in FIG. 2B.\nFIG. 3 is a diagram showing an output signal in accordance with the input noise in the conventional voltage comparator. FIG. 3A shows an input voltage Vin and reference voltage Vref when noise is generated in an interval where the input voltage Vin is larger than the reference voltage Vref. FIG. 
3B shows an output signal when the noise is generated.\nIn a stable voltage comparator, the output of the voltage comparator is maintained at low level, even though noise C is generated in the input voltage Vin at an interval where the input voltage Vin is larger than the reference voltage Vref. However, the above-described conventional voltage comparator responds to the noise C so as to output a high-level signal, as shown in FIG. 3. Most electronic files created by an application program have an external identifier tag assigned by the particular application program that was used to create the electronic file. The external identifier tag, which identifies the format in which the electronic file is stored, is a separate indicator that is attached to the electronic file. Generally, there are numerous specific file formats, such as word processing, database, spreadsheet, and graphics files. These specific file formats contain specialized information that only the application, which was used to create the electronic files, can fully interpret. Therefore, it is important that the application program used to create the electronic file is able to recognize and open the electronic file.\nOne way to associate the electronic files with the application program that created them is to use extensions. Extensions are a set of predefined characters added to the file name by the"} +{"output_text": ", which is costly.\nThe present invention is directed to a tensioning device for a stringed racket. The tensioning device includes a tensioning member having a first end and a second end. The tensioning member is movable between a first position and a second position. The tensioning member is movable between the first position and the second position by a tensioning member drive. The tensioning member drive includes a first member and a second member. The first member is movable between a first position and", "input_text": " 110. 
When the tension in the string multiplied by the distance of the string to the axel 145 matches the tension in the precompressed spring 110 multiplied by the distance of the spring 110 from the axel 145, the tension head 140 rotates along the axel 145, releasing the tension brake 130. The brake engages with the tension crank 120, preventing additional movement of the tension head assembly 100 along the winder bar 40.\nThe tension of the precompressed spring 110 can be manipulated and set by turning a knob 160 connected to the precompressed spring 110, causing the winding of the precompressed spring 110 to become looser or tighter. The precompressed screw is wound about a screw connected to the knob. Turning the screw changes the winding of the spring, which changes the tension. The distance of the precompression is normally very short. The screw, to which the knob 160 is mounted, and the precompressed spring 110 are set such that one unit, or partial unit, of turning changes the tension of the spring by one pound of force. Users in countries utilizing the metric system must purchase a machine set for kilograms instead of pounds since a change of tension in one pound of force is not equal to one kilogram of force. This presents a limitation. In addition, to make the distance of precompression greater, much larger spring would have to be used, which would not be practical. In addition, utilization of a precompressed spring 110 is limiting in that the spring becomes fatigued through repetitive use and constant tension. This fatigue can cause the tension in the strings attached to the racket 20 to decrease, decreasing the performance of the racket 20. Such fatigue also requires a user to take time to recalibrate the tension, lessening the effectiveness of the user and decreasing the rate of production. In addition, the fatigue of the spring requires that the spring be replaced"} +{"output_text": " helix 4 of a homeobox domain. 
The second penetratin peptide can include a peptide comprising helix 3 of a homeobox domain and helix 4 of a homeobox domain.\nIn another embodiment, the chimeric FGF of the present invention can include a first peptide having an amino acid sequence selected from the group consisting of:\n(i) X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-X11", "input_text": " position 17 through 171 of SEQ ID NO:4 (TAT-FGF-2). Preferably, a biologically active FGF protein useful in a chimera of the present invention is encoded by a nucleic acid sequence comprising from nucleotide 59 to 523 of SEQ ID NO:1 (HLX-FGF-2) or from nucleotide 59 to 523 of SEQ ID NO:3.\nIn one embodiment, the penetratin peptide portion of a chimeric FGF of the present invention can include: (a) a first peptide having an amino acid sequence selected from the group consisting of:\n(i) X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-X11-X12-X13-X14-X15-X16; and,\n(ii) X16-X15-X14-X13-X12-X11-X10-X9-X8-X7-X6-X5-X4-X3-X2-X1;\nwherein X1, X2, X3, X4, X5, X7, X8, X9, X10, X11, X12, X13, X14, X15, and X16 each represent an xcex1-amino acid, between 6 and 10 of which are hydrophobic amino acids; and wherein X6 represents Trp; and,\n(b) a second peptide comprising amino acid residues 49-57 of HIV Tat protein (SEQ ID NO:17). In a preferred embodiment, the second peptide of (b) does not comprise amino acid residues 22-36 or 73-86 of HIV Tat protein (SEQ ID NO:17).\nThe first penetratin peptide can include a peptide comprising helix 3 of a homeobox domain and"} +{"output_text": " No. 6,140,486) provides a resource for new methods of commercial PUFA production. However, these marine bacteria have limitations which may not allow their use in commercial PUFA production. First, the marine bacteria do not produce significant quantities of the long chain PUFAs such as EPA and DHA. 
Second, the marine bacteria produce very little total PUFAs of more than 16 carbons and do not produce significant quantities of the longer chain PUFAs such as EPA and DHA", "input_text": " one of the unassigned protein domains.\nThe PKS pathways for PUFA synthesis in Shewanella and another marine bacteria, Vibrio marinus, are described in detail in U.S. Pat. No. 6,140,486 (issued from U.S. application Ser. No. 09/090,793, filed Jun. 4, 1998, entitled \u201cProduction of Polyunsaturated Fatty Acids by Expression of Polyketide-like Synthesis Genes in Plants\u201d, which is incorporated herein by reference in its entirety).\nPolyunsaturated fatty acids (PUFAs) are considered to be useful for nutritional, pharmaceutical, industrial, and other purposes. An expansive supply of PUFAs from natural sources and from chemical synthesis are not sufficient for commercial needs. Because a number of separate desaturase and elongase enzymes are required for fatty acid synthesis from linoleic acid (LA, 18:2 \u0394 9, 12), common in most plant species, to the more saturated and longer chain PUFAs, engineering plant host cells for the expression of PUFAs such as EPA and DHA may require expression of five or six separate enzyme activities to achieve expression, at least for EPA and DHA. Additionally, for production of useable quantities of such PUFAs, additional engineering efforts may be required, for instance the down regulation of enzymes competing for substrate, engineering of higher enzyme activities such as by mutagenesis or targeting of enzymes to plastid organelles. Therefore it is of interest to obtain genetic material involved in PUFA biosynthesis from species that naturally produce these fatty acids and to express the isolated material alone or in combination in a heterologous system which can be manipulated to allow production of commercial quantities of PUFAs.\nThe discovery of a PUFA PKS system in marine bacteria such as Shewanella and Vibrio marinus (see U.S. 
Pat."} +{"output_text": " and Blu-ray discs have their own particular interactive format. Any home media device or local computer that might be developed to support all of the popular formats would require a level of sophistication and flexibility that would likely make it prohibitively expensive and complex for the consumer to operate.\nAdding to the problem, if a new format were introduced later in the future the local device may not have the hardware capability to support the new format, which would mean that the consumer would have to purchase an upgraded local media", "input_text": " Sonos\u00ae Digital Music system stream audio directly from the Internet. Likewise, devices like the Slingbox\u2122 entertainment player record video and stream it through a home network or out through the Internet where it can be watched remotely on a PC. And Internet Protocol Television (IPTV) services offer cable TV-like services through Digital Subscriber Line (DSL) or other home Internet connections. There have also been recent efforts to integrate multiple media functions into a single device, such as the Moxi\u00ae Media Center and PCs running Windows XP Media Center Edition. While each of these devices offers an element of convenience for the functions that it performs, each lacks ubiquitous and simple access to most media. Further, such devices frequently cost hundreds of dollars to manufacture, often because of the need for expensive processing and/or local storage. Additionally, these modern consumer electronic devices typically consume a great deal of power, even while idle, which means they are expensive over time and wasteful of energy resources. For example, a device may continue to operate if the consumer neglects to turn it off or switches to a different video input. 
And, because none of the devices is a complete solution, it must be integrated with the other stack of devices in the home, which still leaves the user with a rat's nest of wires and a sea of remote controls.\nFurthermore, when many newer Internet-based devices do work properly, they typically offer media in a more generic form than it might otherwise be available. For example, devices that stream video through the Internet often stream just the video material, not the interactive \u201cextras\u201d that often accompany DVDs, like the \u201cmaking of\u201d videos, games, or director's commentary. This is due to the fact that frequently the interactive material is produced in a particular format intended for a particular device that handles interactivity locally. For example, each of DVD, HD-DVDs"} +{"output_text": " device are complex and difficult to manufacture.\nAccordingly, there is a need for a full tissue resection device that is easy to manufacture and that provides a reliable and effective means for full tissue resection.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer interconnection structure.\nIn recent years, the integration density of semiconductor devices has been increased, and the multilayer interconnection structure has been adopted in order to", "input_text": " are hinged relative to each other. The staples would be located in a cartridge placed in the stapler head, and assume two rows forming a quarter circle curve and a straight line extending from the quarter circle curve. The anvil would be provided with a similar configuration. A knife would moves along a similarly shaped track inside of the staples. The stapler head would moved relative to the anvil (or vice versa) utilizing a cable and pulley arrangement in conjunction with a bevel gear. 
The proposed device would not be intended to perform an anastomosis, but rather would provide a full tissue resection and stapling repair of a smaller portion of the colon without requiring invasive abdominal surgery (i.e., an incision). It is believed that full tissue resections of the diseased portions of the colon should be a sufficient treatment where invasive abdominal surgery can be avoided, as the treatment is less traumatic and can be repeated if necessary.\nWhile the proposed full tissue resection device would provide a conceptual improvement over the prior art, it would still suffer from several drawbacks. First, a clamshell shaped arrangement with a hinged stapler head and anvil is non-optimal for several reasons. First, with a hinged arrangement, tissue is forced outward from the stapler as the stapler is closing, thereby risking a failure to obtain the desired tissue. Second, with the hinged arrangement the overall diameter of the clamshell and the stapler device is increased dramatically when the stapler is opened, thereby risking tearing of non-diseased tissue. Third, alignment of the staples with the receivers in the anvil is extremely difficult because the stapler head and anvil will not always close to the same location (i.e., depending on the amount and density of the tissue trapped therebetween). In addition, with a clamshell shaped hinged stapler, the mechanics of the"} +{"output_text": " molecular weight of 1,000 or less.\nIn the above formulas, the letter j is an integer from 0 to 5; u and h are each 0 or 1; s, t, sxe2x80x2, txe2x80x2, sxe2x80x3, and txe2x80x3 are each numbers which satisfy s+t=8, sxe2x80x2+txe2x80x2=5,", "input_text": " compounds having at least one carboxyl group include those of formulas (D1) to (D14) below. 
\nIn these formulas, R201 and R202 are each hydrogen or a straight or branched alkyl or alkenyl of 1 to 8 carbon atoms; R203 is hydrogen, a straight or branched alkyl or alkenyl of 1 to 8 carbon atoms, or xe2x80x94(R207)hxe2x80x94COOH; R204 is xe2x80x94(CH2)ixe2x80x94 (where i=2 to 10), an arylene of 6 to 10 carbon atoms, carbonyl, sulfonyl, an oxygen atom, or a sulfur atom; R205 is an alkylene of 1 to 10 carbon atoms, an arylene of 6 to 10 carbon atoms, carbonyl, sulfonyl, an oxygen atom, or a sulfur atom; R206 is hydrogen, a straight or branched alkyl or alkenyl of 1 to 8 carbon atoms, or a hydroxyl-substituted phenyl or naphthyl; R207 is a straight or branched alkylene of 1 to 10 carbon atoms; R208 is hydrogen or hydroxyl; the letter j is an integer from 0 to 5; u and h are each 0 or 1; s, t, sxe2x80x2, txe2x80x2, sxe2x80x3, and txe2x80x3 are each numbers which satisfy s+t=8, sxe2x80x2+txe2x80x2=5, and sxe2x80x3+txe2x80x3=4, and are such that each phenyl skeleton has at least one hydroxyl group; and xcex1 is a number such that the compounds of formula (D8) or (D9) have a"} +{"output_text": " Patent Document #2, there is disclosed a device that is provided with a fluid passage connection device that connects a fluid passage, and a valve mechanism that is closed by a spring, and that is provided with a valve body that is opened and closed by the spring, and a valve seat that is provided on the valve body, and that is provided with a valve seat surface that is opened and closed by the valve body.\nIn Patent Document #3, there is disclosed a device for fluid linking device", "input_text": " of systems (as of 3-1972) will be found in the treatise \"Advanced Waste Water Treatment Concepts\" by Dr. James E. Young, P.E. Research Consultant in Environmental Engineering, General Filter Company, Ames, Iowa, appearing in Bul. No. 7221, 3-72-2M-W, entitled \"GFC Conservation for `better water`\", published by General Filter Co. 
In, for example, the technical field of machining, it is often the case that a plurality of hydraulic clamp devices are fitted to a work pallet, and, in a state with the work being fixed by these clamp devices, the work is machined by a machining center. Many hydraulic clamp devices are driven to clamp by hydraulic pressure or by the elastic force of a spring, and are unclamped by hydraulic pressure. And there are provided a mechanism for positioning a work pallet to which the hydraulic clamp device is provided with respect to a base member and a clamping mechanism for fixing it there, and a fluid passage connection device that connects and separates a hydraulic passage for hydraulic pressure supplied to and vented from the work pallet.\nIn Patent Document #1, for a work pallet that can be fitted to or removed from a table of a machining center and for that table, there is disclosed a device provided with a positioning and fixing mechanism that positions and fixes the work pallet with respect to the table, and with a fluid passage connection device that connects a fluid passage. 
This fluid passage connection device includes a table side female coupler and a pallet side male coupler, and these male and female couplers incorporate valve mechanisms that are closed by springs, so that the pallet side male coupler may be separated from the female coupler while still maintaining a state in which fluid pressure is still remained.\nAnd, in a device for fluid linking device described in"} +{"output_text": " of the teeth to protect the horse's soft tissue from injury.\nIt is further desired to provide a system that allows the user to select the preselected shaped and surfaced files, rasps or other tools such as diamond cut-off blades from a set of preselected tools.\nIt is further desired to provide a system that allows the user to select the preselected shaped and surfaced files, rasps or other tools such as diamond cut-off blades from a set of preselected tools that are", "input_text": " associated with the molars and premolars. However, if the incisors are to long, opposing molars and premolars may be prevented from engaging properly.\nIn the prior art, hand tools similar to metal files or rasps were used to remove a selected portion of the tooth surface. These tools consisted of several shaped handles with pads mounted on one end. The pads accepted plates having an abrasive or specially designed file or rasp-toothed surface selected by the user. The mounted abrasive or rasp on the handle was then inserted into the horse's mouth and positioned against the tooth structure that needed to be altered. The user then manually applied pressure and movement to the handle until the selected portion of the tooth structure was removed.\nSome prior solutions to the problem were to add motor power to the burrs to provide a \u201cpower dental tool\u201d to replace the manual rasps. 
These solutions ease the manual work but introduced other problems such as the uncontrolled creation of dust and debris as well as the danger of injury to the horse and user from exposed high speed reciprocating or rotary burrs or rasps which may engage soft tissue such as the cheek, tongue, or gums inside the horses mouth.\nThus, there has long been a need for an arrangement that allows the user, usually a veterinarian, an owner or an equine dentist, to easily perform the removal of preselected material from the exposed surface of the horse's teeth without danger to the horse or the person doing the job.\nIt is desired that the arrangement allow the user to access the full array of teeth with a set of preselected shaped and surfaced files, rasps or other tools such as diamond cut-off blades.\nIt is further desired that the arrangement be motor driven but provide safety to the user and horse.\nIt is further desired to provide preselected shaped covers or guards around selected portions"} +{"output_text": " difference between the optical path of the light beam reflected by the facet 4 and the optical path of the light beam reflected by the facet 4'. The optical path length difference is caused by the difference in the angle of incidence of the light beam on the facet 4 and the angle of incidence of the light beam on the facet 4'. The optical path length difference is a function of the angle of incidence of the light beam on the facet 4 and the angle of incidence of the light beam on the facet 4'.", "input_text": "ing the light beam at a constant angular velocity. The deflected light beam is focused as a beam spot on a surface by a focusing lens system for scanning the surface.\nFIG. 2 of the accompanying drawings illustrates a conventional light scanning device of the type described. A light beam emitted from a light source 1 is focused as a linear image near a reflecting surface 4 of a rotating polygon mirror 3 by a first focusing optical system 2. 
The light beam reflected by the rotating polygon mirror 3 is deflected at a constant angular velocity upon rotation of the rotating polygon mirror 3. The deflected light beam is then focused as a beam spot on a surface 7 by a second focusing optical system comprising lenses 5, 6 for scanning the surface 7.\nThe light scanning device employing a rotating multi-faceted polygon, however, suffers from the problem of a facet error. That is, the mirror facets of the polygon may not lie parallel to the axis of rotation of the polygon mirror. One known method of solving this problem is to use an anamorphic optical system as the second focusing optical system disposed between the rotating polygon and the surface to be scanned, and to position the reflecting position on the rotating polygon and the scanning surface in conjugate relationship with respect to an auxiliary scanning direction (vertical direction in FIG. 3). In FIG. 3, the second focusing optical system couples the reflecting position on the rotating polygon 3 and the scanned surface 7 in substantilly conjugate relationship as viewed in the auxiliary scanning direction. Therefore, even if a mirror facet 4 of the rotating polygon suffers from a deviant orientation as represented by 4', the focused position on the scanned surface 7 is not virtually moved in the auxiliary scanning direction by the second focusing optical system. The facet error is corrected in this manner.\nWhen the polygon mirror 3 rotates, the reflecting surface or facet 4 rotates about an axis 3A, and there is developed an optical path length"} +{"output_text": " to a method for the production of a semiconductor device, and more particularly to a method for the production of a semiconductor device having a high-speed transistor.\nIn recent years, a high-speed transistor has been required for a semiconductor device. In order to realize a high-speed transistor, it is necessary to reduce the parasitic capacitance of a gate electrode. 
In order to reduce the parasitic capacitance of the gate electrode, it is necessary to reduce the thickness of a gate insulating film. However, when", "input_text": " the pitches of each of these pitchers is a daunting task, at best.\nAs will be appreciated, none of these prior patents even address the problem faced by applicant let alone offer the solution proposed herein.\nAgainst the foregoing background, it is a primary object of the present invention to provide a method for profiling pitches using a computerized pitching machine.\nIt is another object of the present invention to provide such a method which permits the pitching machine to be programmed to deliver pitches having a pitch profile of actual pitchers.\nIt is still another object of the present invention to provide such a method which can accommodate a variety of different pitchers and pitches without the need to separately program pitch parameters for individual pitchers to create a profile.\nTo the accomplishments of the foregoing objects and advantages, the present invention, in brief summary, comprises a method for profiling pitches of an actual pitcher using a programmable pitching simulator of the type having at least two wheels and a video display component. 
The method comprises the steps of: (a) creating pitch profile codes for all pitches that a pitcher can reasonably pitch, the pitch profile codes including information regarding pitch type, pitch speed and pitch movement; (b) developing a master pitch parameter table for each of the pitch profile codes, the pitch parameter table including all data reasonably necessary to program the programmable pitching simulator to throw profiled pitches; (c) developing pitch profile codes for a particular pitcher, the pitch profile codes also including a code for a video image to be displayed; (d) entering into the programmable pitching simulator the specific pitch profile codes for a particular pitcher by the use of a card containing the pitch profile codes; and (e) re-programming the programmable pitching simulator to deliver pitches with the same pitch profiles of the pitcher. The method can further include developing specific sequences of particular profiled pitches to a particular batter in the sequence that the pitcher has historically pitched to the batter. The invention relates"} +{"output_text": " a plurality of first and second frame terminals 152a and 152b, respectively. First frame terminal 152a and second frame terminal 152b are connected to a first and second frame terminal of connector 131, respectively.\nHandle 170 is rotatably connected to rotating connection portion 140 by a shaft 170a. A first end of shaft 170a is rotatably connected to a first frame terminal 152a of first frame terminal 152a. A second end of shaft 170a is rotatably connected to", "input_text": " formed through hood 122 for receiving hose assembly 130. Hood 122 also is provided with a transparent window 128 for notifying the user of the dust collecting state. Cover 124 encloses a motor compartment (not shown) where an electric motor and a suction fan driven by the electric motor are positioned. 
Further, a main electric cord 129 for applying an electric power from an external electric source to vacuum cleaner 100 is installed in the motor compartment. Main electric cord 129 is provided with a plug 129a at its free end.\nHose assembly 130 comprises a rigid wand 132 and a flexible hose 134, and is pneumatically connected to a dust collecting compartment (not shown) of canister unit 120 by a suction hose connector 136. Rigid wand 132 is rotatably connected to flexible hose 134 by a handle assembly 200.\nMeanwhile, FIG. 7 illustrates the conventional handle assembly 200 in detail. Handle assembly 200 mainly includes a connector 131, a rotating connection portion 140 and a handle 170. A free end of connector 131 is detachably connected to an end of rigid wand 132. Pipe hose 131 is rotatably connected to flexible hose 134 by rotating connection portion 140.\nRotating connection portion 140 comprises an inner pipe 142 and an outer pipe 144. Inner pipe 142 is rotatably installed in outer pipe 144. Outer pipe 144 is integrally formed with connector 131. That is, outer pipe 144 extends downward from connector 131. In a front end of inner pipe 142, a ring-shaped packing pipe 150 is disposed on an outer periphery of inner pipe 142. Packing pipe 150 provides an airtight seal between inner pipe 142 and outer pipe 144. In a middle position of inner pipe 142, a ring-shaped first frame terminal 152a and a ring-shaped second frame terminal 152b are mounted to an outer surface of inner pipe 142. First frame terminal 152a and second frame terminal 152b include"} +{"output_text": ", which is carried out by means of a screw press. 
The product obtained in this way has a grain diameter of about 50.mu.m.\nThe known processes for the production of dicalcium phosphate anhydride are not suitable for the production of products having a grain diameter of more than 50.mu.m.\nThe object of the present invention is to provide a process for the production of dicalcium phosphate anhydride which is suitable for the production of products having an average grain diameter", "input_text": " As is known in the art, products produced by such processes have an average grain diameter of less than 50.mu.m when underground.\nAccording to the known processes for the production of dicalcium phosphate anhydride, milk of lime is, for example, introduced into dilute phosphoric acid, the phosphoric acid solution thereby being heated to at least 70.degree. C. Subsequently, with vigorous stirring, highly concentrated milk of lime is added thereto as quickly as possible until a pH value of 6.5 is achieved in the resultant suspension, whereafter the reaction is practically finished. In the filtrate obtained after the separation of the solid material, there are generally found about 5 mg/1 of phosphorus pentoxide. The introduction of the milk of lime usually takes place by allowing the suspension to run in from the lid of the reaction vessel.\nIn the case of such a procedure, there is obtained a finely-divided material which, because of its fineness, can, in part, only be filtered with difficulty. 
A precipitation at comparatively low temperatures, as well as a slow introduction of the milk of lime, results in a co-precipitation of the dihydrate.\nIf an attempt is made to increase the average grain diameter by measures such as seeding of the reaction batch, longer residence times during the crystallization, lower precipitation temperatures or higher dilution of the reaction solutions, then only a small effect is obtained in the grain size.\nThe \"coarser\" crystals forming in purely statistical distribution can be separated by purely mechanical processes from the product stream by screening and/or sieving but the yield in the case of this procedure is extremely small. For this reason, it has been suggested to produce tablettable DCPA products by the roundabout way of drying DCPD (see EP 0 210 661). In addition to the precipitation, this process requires a compacting step"} +{"output_text": " and transmits the photographed image to the anomaly detection section. The anomaly detection section detects whether or not any anomaly occurs based on the image photographed by the image pickup camera and the image transmitted from the camera unit.\nIn the crime prevention camera apparatus, the image pickup camera photographs the surroundings of the object to be monitored and transmits the photographed image to the anomaly detection section. The anomaly detection section detects whether or not any anomaly occurs based on the image photographed by the image pickup camera and the image transmitted from the", "input_text": "\nThe microcomputer of the lens unit retains peculiar information concerning the lens unit in an internal storage section and transmits the peculiar information to the video camera connected as circuitry. 
Thus, the video camera can acquire the peculiar information concerning any lens unit attachable to the video camera and can perform given processing operation corresponding to the lens unit.\n\nHitherto, an intelligent remote monitor system including a plurality of predetermined cameras installed for monitoring the sites and spots of factories, buildings, etc., at remote locations, a camera unit for receiving video signals directly from the plurality of predetermined cameras, and a monitoring personal computer and a monitor support apparatus connected to the camera unit through the Internet has been available. (For example, refer to JP-A-11-266487.)\nIn the intelligent remote monitor system, the camera unit causes the predetermined cameras to perform given operation based on an operation signal received from the monitoring personal computer and performs predetermined processing for any of videos photographed by the cameras and then transmits a signal of the video to the monitoring personal computer and the monitor support apparatus.\nTherefore, in the intelligent remote monitor system, the monitoring personal computer is installed in a guardroom or overnight accommodations separate from a control room, whereby the sites and spots of the factories, buildings, etc., at remote locations can be monitored from the guardroom or overnight accommodations. The monitor support apparatus saves the received videos in sequence and detects whether or not situation change occurs based on the most recent video and its immediately preceding video.\nHitherto, a crime prevention camera apparatus including a camera unit having an image pickup camera, memory, a communication modem, etc., and an anomaly detection section of a vibration detector, a door opening/closing sensor, etc., has been available. 
This crime prevention camera apparatus is installed so that the image pickup camera of the camera unit photographs the surroundings of the object to be monitored"} +{"output_text": " which is applied to the examination subject by means of a radio-frequency antenna 5. The radio-frequency excitation pulse RF is applied to the examination subject in the examination volume 4, and is transmitted into the examination subject by the radio-frequency antenna 5. The radio-frequency excitation pulse RF is a short pulse of a few hundred microseconds duration, and is applied to the examination subject in the examination volume 4 at a power level of about 1.5 kW. The radio-frequency excitation pulse", "input_text": " set forth in the article \"NMR-Imaging Techniques and Applications: A Review\", Bottomly, Review of Scientific Instrument, 53(9), September, 1982pages 1319-1337.\nFor topical resolution in three dimensions, magnetic field gradients in three directions, preferably orthogonally disposed, must be produced. The x, y, z axes of a Cartesian coordinate system are shown in FIG. 1 to indicate the direction of the physical gradients G.sub.x, G.sub.y and G.sub.z produced by the gradient coils in the system of FIG. 1. The gradient coils 2, in the form of saddle coils, generate a physical magnetic field gradient G.sub.y in the y-direction. A substantially constant magnetic field gradient G.sub.y in the y-direction is generated within a spherical examination volume 4 by the conductor sections 2a. Due to their greater distance from the examination volume 4, the return conductors produce only negligible magnetic field components in the examination volume 4.\nThe gradient coils for generating the physical magnetic field gradients in the x-direction are constructed identically to the gradient coils 2 for the y-direction magnetic field gradient, but are rotated 90.degree. in azimuthal direction of the cylindrical carrier 1. 
For clarity, these gradient coils are therefore not shown in FIG. 1.\nThe gradient coils 3 for generating the physical magnetic field gradient in z-direction are annular, and are arranged symmetrically relative to the center point of the examination volume 4. The two individual coils 3a and 3b respectively carry current flowing in opposite directions, as indicated in FIG. 1, so as to produce a magnetic field gradient in z-direction.\nA typical pulse sequence of the EPI method is set forth below.\nAt the beginning of the pulse sequence, the examination subject is subjected to a radio-frequency excitation pulse RF,"} +{"output_text": " and (B) a multivalent metal ion source. The activating composition can also be an autodepositable, aqueous metal treatment composition that includes (Axe2x80x2) an aqueous dispersion of a phenolic novolak resin that includes water and a reaction product of a phenolic resin precursor, a modifying agent and a multi-hydroxy phenolic compound wherein the modifying agent includes at least one functional moiety that enables the modifying agent to react with the phenolic resin precursor and at least one ionic moiety", "input_text": "x9d in that the coating increases in thickness and areal density (mass per unit area) the longer the time the metal surface is immersed in the autodepositable composition.\nThe autodeposition characteristic of the invention is important to provide corrosion and environmental resistance. It allows for the formation of an exceptionally uniform and thin protective barrier. Excellent corrosion and environmental resistance is possible only if the entire surface of a metal part is protected with a barrier coating. This requirement is usually difficult to achieve on substrate surfaces that have a very complex topology. 
With the superior autodeposition of this invention, wetting and thus protection of such complex surfaces is achieved.\nAnother important advantage of the primer or coating composition is that a bath of the composition does not appear to change in composition as cumulative metal surfaces are dipped in the bath over a period of time. It is believed that since the very hydrophilic phenolic resin dispersion immobilizes or coagulates on the surface as a swollen wet gel rather than as a precipitate, the composition of the bath is the same as the deposited wet gel and the bath is not depleted.\nActivation of the metallic surface to prepare it for receiving the autodepositable composition can be achieved by pretreating the surface with an activating composition that generates freely-available multivalent ions on the surface. The activating composition can be an aqueous solution of multivalent ions such as calcium, magnesium, iron and manganese.\nThe activating composition can also be an autodepositable, aqueous metal treatment composition that includes (A\u2032) an aqueous dispersion of a phenolic novolak resin that includes water and a reaction product of a phenolic resin precursor, a modifying agent and a multi-hydroxy phenolic compound wherein the modifying agent includes at least one functional moiety that enables the modifying agent to react with the phenolic resin precursor and at least one ionic moiety,"} +{"output_text": " determined that no signal is received.\nHowever, in the conventional system, when a trouble occurs on the transmitting side, the wayside controller cannot recognize the trouble, and therefore the wayside controller cannot carry out control to stop the train.\nFurther, when a trouble occurs on the receiving side, the wayside controller cannot recognize the trouble, and therefore the wayside controller cannot carry out control to stop the train.\nIn addition, when a trouble occurs on the transmitting side, the wayside", "input_text": " the 
existence of a train, a high reliability is required, because a control device on the ground (a wayside controller) utilizes a train detecting signal generated as described above to locate the train and to operate traffic signals for the train. Particularly, for the purpose of securing adequate safety in the train service, it is absolutely essential to avoid possibility that, although a train actually exists within a certain section forming a track circuit and therefore the pair of rails which form the track are short-circuited, a signal indicating no train in the section of the track circuit is erroneously transmitted, possibly due to a failure in a transmitter/receiver device, for example.\nConventionally, to solve such a problem, highly reliable equipment has been used for the transmitter/receiver devices installed in every track circuit, as well as for the wayside controller. When any trouble occurs in transmitting or receiving signals, the control which is carried out is as follows: i.e., no signal is transmitted on the transmitting side, and a determination is then made as to whether no signal is received on the receiving side.\nIn the conventional system as mentioned above, the large number of transmitter/receiver devices must be subject to very careful maintenance. 
Further, an individual signal cable is used for the connection between every transmitter/receiver device and the wayside controller, in order to avoid possible misrecognition of information among the devices.\nFurthermore, JP-A 6-92232 proposes that a signal, which has a different frequency for every track circuit, be used in order to avoid erroneously receiving a train detecting signal from an adjacent track circuit.\nTo sum up, as described above, when any trouble occurs in transmitting or receiving, the conventional system carries out control in such a manner that, if trouble occurs on the transmitting side, no signal is transmitted, and if it occurs on the receiving side, it is"} +{"output_text": ".\nDextromethorphan is a non-narcotic antitussive and antitussive agent. It is used in the treatment of coughs and colds. It is also used in the treatment of coughs and colds in combination with other antitussive agents.\nDextromethorphan is a racemic mixture of dextromethorphan hydrobromide and dextromethorphan dihydrobromide. The dextromethorphan hydro", "input_text": " \u03bcm up to about a millimeter. As was indicated previously, with such an optical waveguide, a laser resonator can be fabricated, by having the resonator mirror placed on the two front surfaces of the optical waveguide. Such a laser is distinguished in that the large surface ensures effective removal of energy-loss heat.\nAdditionally, an optical waveguide has the advantage that it is via the large surface of an appropriately long optical waveguide that the energy-loss heat can be directed outward via the cover surface. For this purpose, two options are available. One is to mount the optical waveguide on a cooling plate and be in thermal contact with the cooling plate. The other possibility is to place the optical waveguide in a cooling chamber. Such a cooling chamber can be formed by having a hose around the optical waveguide, so that free space remains between the optical waveguide and hose. 
Through this space, a circulating fluid such as a coolant can be made to flow. The cooling cover and/or the coolant can assume a waveguide function for the pumping radiation.\nThere are instances when a long pump length must be attained in the pumping beam direction, particularly for the case of axial pumping. In such instances, the amplification medium should be pumped with radiation whose wavelength corresponds at least to a part of the weak absorption lines of the medium. In connection with a solid state medium doped with neodymium, it is pumped with pumping radiation whose wavelength is about 870 nm. This combination results in a highly efficient, long pumping extent in the direction of the optical resonator. By this means, possible parasitic oscillations can be suppressed.\nAdditional particulars and features of the invention can be gleaned from the following description of specific embodiment examples, using the drawings. This invention relates to the use of dextromethorphan, optionally encompassing salts, prodrugs and metabolites thereof, for the manufacturing of a medicament to be administered transdermally"} +{"output_text": " neurotransmitters, the mechanism of action is different for each serotype.\nThe mechanism of action of BT serotype A is to inhibit the release of acetylcholine (ACh) from the presynaptic membrane of the neuromuscular junction. The mechanism of action of BT serotype B is to inhibit the release of ACh from the presynaptic membrane of the neuromuscular junction and the release of neuropeptides from the presynaptic membrane of the autonomic ganglia. The mechanism of action of BT serotype", "input_text": "usually a drop in cell voltage) hundreds of vesicles merge with the cell membrane to release their neurotransmitters. 
The neurotransmitters diffuse across the synaptic space to bind to and excite the postsynaptic membrane of a second neuron.\nExocytosis requires specialized proteins on the vesicle and presynaptic membrane that are collectively known as the SNARE proteins. Removal of any of these proteins can stop vesicle docking to membrane and block or decrease neural signaling. One protein on the vesicle membrane called VAMP (vesicle associated membrane protein) and one on the presynaptic membrane called SNAP (synapse associated protein) are the targets of the botulinum and tetanus neurotoxins from the Clostridial bacterium.\nBotulinum toxin (BT) is a potent neurotoxin produced by the anaerobic gram-positive bacterium Clostridia botulinum and the closely related species Clostridia butyricum and beratti. When spores of the Clostridia botulinum are ingested they germinate and secrete BT that passes from the GI tract into the systemic circulation. The systemic spread of BT causes the disease botulism that is characterized by widespread neuromuscular paralysis.\nBT is a protein consisting of a light and heavy chain that together weigh approximately 150 kilodaltons. BT works by a three-stage mechanism, binding, translocation into the neuron and molecular action, each of which is performed by separate 50 kilodalton domains. The binding and translocation domains make up the heavy chain, while the catalytic action is performed by the single domain of the light chain.\nAt present seven immunologically distinct serotypes of the BT are known, named A, B, C, D, E, F and G. The effect of BT is to inhibit the release of neurotransmitters and neuropeptides by neurons. Although all BT serotypes interfere with proteins that cause the exocytosis of"} +{"output_text": " the smallest wavelength of light that is reliably produced and controlled.\nThe wavelength of light used in the photolithography process is limited by the materials used in the optics and the mask. 
The mask is typically made of a material that transmits the light waves selectively. The mask is typically made of a material that is transparent to the light waves, but is opaque to the light waves at the wavelength of light used in the photolithography process. The mask is typically made of a material that is opaque to", "input_text": " the integrated circuit. This pattern can be imaged onto a certain area on the substrate that has been coated with a layer of radiation-sensitive material known as photoresist or resist. Once the patterned layer is transferred the layer may undergo various other processes such as etching, ion-implantation (doping), metallization, oxidation, and polishing. These processes are employed to finish an individual layer in the substrate. If several layers are required, then the whole process or variations thereof will be repeated for each new layer. Eventually, a combination of multiples of devices, which may be integrated circuits, will be present on the substrate. These devices may then be separated from one another by dicing or sawing and then may be mounted into individual packages.\nOptical lithography may be 193 nm light, with or without immersion, or extreme ultraviolet (EUV) or X-ray lithography, or any other frequencies of light or any combination thereof.\nOptical lithography that uses 193 nm light waves works with refractive optics and transmissive photomasks or reticles. The masks block, partially block, or transmit the light waves selectively on to a substrate, which is typically resist-coated during the lithographic process, to partially expose or to expose different parts of the substrate or some material on the substrate. The masks are typically at 4\u00d7 magnification of the target substrate dimensions.\nExtreme Ultraviolet Lithography (EUV) uses approximately 13.5 nm wavelength of light with reflective optics. 
Some implementations use an anamorphic mask with magnifications of 8\u00d7 in one dimension and 4\u00d7 in the other dimension.\nIn general, smaller wavelengths of light are able to resolve finer geometries, finer spaces in between geometries, and a higher frequency (density) of features on the substrate. Also in general, smaller wavelengths of light are more difficult to reliably produce and control. Economically, it is best to use"} +{"output_text": " the printed page. The fuser roll is a hollow cylinder having a concave shape. The diameter of the ends of the roll is larger than the diameter of the center of the roll. The patent suggests that the mechanism of action is that the paper velocity through the fusing nip is greater at the ends of the roll than at the middle of the roll, thereby stretching out any wrinkles formed.\nU.S. Pat. No. 4,984,256, Yano, issued", "input_text": " pressure differential changes the velocity of the paper at various points along the nip, thereby keeping the paper moving through the nip straight.\nU.S. Pat. No. 4,930,202, Yano, issued Jun. 5, 1990, describes fixing rollers which have a non-uniform diameter across their length; the rollers either crown at their center or at their ends. The shaft through the roller is bent to parallel the surface shape of the roller. This structure is said to decrease paper wrinkling and bending of the roller shaft during use.\nU.S. Pat. No. 4,872,246, Yano, issued Oct. 10, 1989, describes fixer rolls which have a larger diameter at their ends than at their center (i.e., the rolls have a concave shape). The roll body is utilized on a curved shaft and this structure is said to minimize wrinkling of the printed page. See also, U.S. Pat. Nos. 4,803,877 and 4,870,731.\nU.S. Pat. No. 3,999,038, Sikes, Jr., et al., issued Dec. 
21, 1976, describes a fuser roll having an hour-glass shape (i.e., a concave structure) wherein the diameter of the ends of the roll is larger than the diameter of the center of the roll. This structure is said to reduce wrinkling of the printed page, especially in duplex operations. The patent suggests that the mechanism of action is that the paper velocity through the fusing nip is greater at the ends of the roll than at the middle of the roll, thereby stretching out any wrinkles formed.\nU.S. Pat. No. 4,008,955, Bar-on, issued Feb. 22, 1977, describes a fuser roll structure which is said to minimize wrinkling of"} +{"output_text": " EEG signal is recorded from a single electrode. However, the present invention envisions the use of multiple electrodes for recording the EEG signal. The present invention envisions the use of multiple electrodes for recording the EEG signal. The present invention envisions the use of multiple electrodes for recording the EEG signal. The present invention envisions the use of multiple electrodes for recording the EEG signal. The present invention envisions the use of multiple electrodes for recording the EEG signal. The present invention envisions the use of multiple", "input_text": " the present invention envisions enhancement of detection by the use of the spatial domain as it applies to the positioning of detection and treatment electrodes.\nFinally, the present invention also envisions signal-to-noise enhancement for optimizing the detection of neurological events by searching for signals in a particular frequency domain. For example, a low-pass filter that excludes signals above 5 Hz could be used to enhance the reliability for detection of a neurological event for certain patients. In addition, detection may be enhanced by first conditioning the EEG signals using programmable, multiple step, signal processing. 
The processing steps that are envisioned for this signal conditioning include signal summing, squaring, subtracting, amplifying, and filtering.\nIt is also envisioned that any combination of techniques for signal detection in the time, spatial or frequency domain could be used for providing a highly reliable system for the detection of a neurological event.\nThe present invention envisions four different modalities for stopping the progression of a neurological event such as an epileptic seizure once it has been detected. A preferred method is to provide a responsive stimulation electrical signal, a second method is to release medication in response to the detection of an event, a third method is to provide an electrical short circuit in the vicinity of the epileptic focus to prevent the occurrence of a full epileptic seizure and a fourth method is the application of a sensory input through normal sensory pathways. Such sensory input could be acoustic (sound input), visual (light input), or other sensory input such as mechanical vibration or electrical stimulation of the skin. Of course it is envisioned that any two or more of these modalities can be used in combination in order to preclude, prevent or decrease the severity of a neurological event such as an epileptic seizure, migraine headache, Parkinson's disease tremor, etc. A valuable attribute of the present invention is the ability to record the EEG signal from any one or all of the detection electrodes. Typically the"} +{"output_text": " the elastic bodies are in a stretched state, and the elastic bodies are in a compressed state when the sliding bodies slide along the guide rails in lateral directions. 
Therefore, the elastic bodies are in a stretched state when the sliding bodies slide along the guide rails in lateral directions, and the elastic bodies are in a compressed state when the sliding bodies slide along the guide rails in longitudinal directions.\nTherefore, when the sliding bodies slide along the guide rails in lateral directions, the elastic bodies are in a stretched state", "input_text": "roof device such as described above, however, there is a risk that seal members mounted around edges of the opening in the roof. Especially for one at the front edge may be damaged at an earlier time due to frequent rubbing of the sliding lid against the seal member at the front edge of the opening.\nIn order to eliminate such a risk as described above, it is contemplated that the front end of the sliding lid in addition to a rear end is also lifted and/or lowered independently from the rear end thereof while the sliding lid is being opened or closed. But with the structure like this there would be caused unstable supports of the front and rear ends of the sliding lid, and resulting in the risk that the sliding lid may stagger in lateral directions.\nIn addition, in a conventional example as described above, since various types of guide means are disposed in parallel laterally of guide rails, the lateral width of the guide rail has to be increased, and this causes a problem that an opening formed in the roof has to be narrowed to an extent equal to such an increase in width.\nFurther, in a conventional vehicle sunroof device, sliding bodies are slidably attached to guide rails provided on both side edge portions of an opening formed in a roof of a vehicle, and rubber rollers are rotatably supported on the sliding bodies, whereby a movable panel (sliding lid) is moved to an opened state and/or a closed state when the rubber rollers progressively press against bottom sides of movable blocks fixed to the movable panel and production of looseness of respective parts is 
prevented by eliminating gaps therebetween by biasing the elastic bodies (Please refer, for instance, to Japanese Patent Publication No. Hei. 6-297952).\nHowever, with the conventional sunroof device described above, when the sliding bodies slide along the guide rails in longitudinal directions, the elastic bodies move after the sliding bodies while"} +{"output_text": " metallocenes with non-coordinating anions are also known, see for example U.S. Pat. Nos. 5,153,157, 5,198,401, 5,278,264, 5,304,614, 5,321,106, 5,329,033, 5,346,925, 5,348,962, 5,350,723, 5,391,790, 5,391,789, 5,399,636,", "input_text": " those known in the art, see again EP-A-277,004, WO-A-92/00333 and U.S. Pat. Nos. 5,198,401, 5,001,205, 5,324,800, 5,308,816, and 5,304,614 for specific listings. Selection of metallocene compounds for use to make isotactic or syndiotactic polypropylene, and their syntheses, are well-known in the art, specific reference may be made to both patent literature and academic, see for example Journal of Organometallic Chemistry 369, 359-370 (1989). Typically those catalysts are stereorigid asymmetric, chiral or bridged chiral metallocenes. See, for example, U.S. Pat. Nos. 4,892,851, 5,017,714, 5,296,434, 5,278,264, WO-A-(PCT/US92/10066) WO-A-93/19103, EP-A2-0 577 581, EP-A1-0 578 838, and the academic literature \u201cThe Influence of Aromatic Substituents on the Polymerization Behavior of Bridged Zirconocene Catalysts\u201d, Spaleck, W., et al, Organometallics 1994, 13, 954-963, and \u201cansa-Zirconocene Polymerization Catalysts with Annelated Ring Ligands-Effects on Catalyst Activity and Polymer Chain Lengths\u201d, Brintzinger, H., et al, Organometallics 1994, 13, 964-970, and documents referred to therein. Though many above metallocenes are directed to catalyst systems with alumoxane activators, the analogous"} +{"output_text": " are received from the host computer through a host command port 107. 
The host command port is connected to a host command processor 109. The host command processor is responsible for receiving commands from the host computer and for transferring command status responses from the RAID array to the host computer. The host command processor is also responsible for receiving status information from the RAID array and for transferring this information to the host computer. The host command processor is connected to a host data port 111. The host data port is connected", "input_text": ". Host data D, which is the information stored, retrieved and manipulated by the host computer, is for convenience referred to hereinafter simply as data D. Meta-data P is used exclusively by the disk array controller and perhaps other disk subsystem components for the control and maintenance of the disk array system. For example, one type of meta-data P may be parity information. Stripes are recorded as sequential blocks on a plurality of different disk drives. Each stripe includes a plurality of data blocks D and one additional set of blocks called parity blocks P. The parity blocks P contain the logical exclusive-OR (XOR) of the plurality of data blocks D, and is recorded on an additional disk drive. Conventionally, the parity blocks P are distributed among all the disk drives of an array, as shown in FIG. 3, in order to avoid drive contention during write operations. The use of parity blocks P improves availability of all of the data in a stripe. When one drive is unavailable, for example, the missing data block from a stripe can be reconstructed from the parity block and the available data blocks. The contents of the parity block is simply XORed with the data blocks remaining. The result of this XOR operation is the data from the missing drive. 
Once such a drive has been repaired, data can be restored to the repaired drive using the parity blocks and data blocks from each good drive in similar fashion.\nA typical RAID-based disk controller 101 is shown in FIG. 1. The controller is connected to a host computer (not shown), through a host port 103. Input/output (I/O) transactions are received through the host port by a host I/O processor 105. The host I/O processor is responsible for receiving commands from the host computer to the RAID array and for transferring data and command status responses from the RAID array back to the host computer. Commands"} +{"output_text": " the character described which is easy to manufacture and install in an electric circuit, because it is sufficient to attach (i.e. by soldering or otherwise) electrical conductors (e.g. wires) only to electrodes on the major faces of the ceramic element.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which is easy to manufacture and install in an electric circuit, because it is sufficient to attach (i.e. 
by soldering or otherwise)", "input_text": " interface plane).\nIt is another object of the present invention to provide a piezoelectric transformer of the character described in which such a deformation of the adjacent piezoceramic element sections produces a second voltage across the electrode segments at the adjacent sections of the ceramic element.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which may be easily and inexpensively produced.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which is easy to manufacture because it is sufficient to polarize each ceramic element only once and in only one direction.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which is easy to manufacture because it is sufficient to apply electrodes only to the major faces of a ceramic element, and which does not require application of electrodes to minor faces of the ceramic element.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described in which electrode segments on a single face of the piezoceramic element are electrically isolated from each other.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which electrically isolates the voltage and current at the input to the device from the voltage and current at the output of the device.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which is easy to manufacture and miniaturize, for example by using Micro Electronic Machining Systems (MEMS).\nIt is another object of the present invention to provide a piezoelectric transformer of the character described which is easy to connect or install in an electric circuit, because it is sufficient to attach (i.e. 
by soldering or otherwise) electrical conductors (e.g. wires) only to electrodes on the major faces of the ceramic element.\nIt is another object of the present invention to provide a piezoelectric transformer of"} +{"output_text": " to have to use two knobs to control a single device.\nAnother prior art solution is to provide a single knob that can be turned multiple revolutions. This solution is apparently common for consumer devices, but it is rare for industrial or laboratory equipment. This disparity is a good example of the fact that it is not user friendly to have to use a single knob to control a single device.\nAnother prior art solution is to provide a single knob that can be turned multiple revolutions. This solution", "input_text": " stored on many MP3 players is now into the thousands. Being able to rapidly move to a song location may require a lot of time, depending upon the interface that is provided for scrolling.\nA good analogy to this situation is tuning a radio that has a wide dynamic range. A radio typically has a simple hand-operated control. Tuning a radio to frequency 95.1 MHz over an entire range of 85-105 MHz is to control 1 part in 200. Using a single-turn \u201cknob\u201d or potentiometer, a single turn or revolution of the knob changes the frequency setting from a minimum of 85 MHz to maximum of 105 MHz. Thus, it becomes obvious why it is very hard to get the \u201cfine\u201d control that is necessary to dial into 0.1 MHz resolution. Fine and coarse control can also be thought of as slow and fast incrementing or decreasing of values.\nPrior art solutions for this problem have included a multi-turn potentiometer or knob. In this scenario, the knob can be turned multiple revolutions where one revolution might be equal to 2 MHz. In this way, it becomes much easier to dial in 0.1 MHz resolution (i.e., 0.1/2.0=> 1/20th revolution). But now a new problem has arisen. 
In order to move over the entire frequency range of 20 MHz will now require ten complete turns of the knob, which now becomes an annoyingly slow procedure. Interestingly, most radios and many industrial controls rely on this \u201cmany-turns-of-the-knob\u201d solution.\nAnother prior art solution is to provide two knobs. One knob is for coarse control, and the other knob is for fine control. This solution is apparently common for industrial or laboratory equipment, but it is rare for consumer devices. This disparity is a good example of the fact that it is not user friendly"} +{"output_text": " to produce a synthesis gas mixture (relation 3 below):CH4+CO2+H2S\u21c4CO+2H2+H2S\u2003\u2003(relation 3)\nThe synthesis gas mixture is then cooled down to a temperature in a range comprised between about 250\u00b0 C. and about 350\u00b0 C. upon exiting a third reactor, and introduced in a fourth reactor where the water gas shift reaction (relation 2) takes place. The synthesis gas mixture is then cooled down to", "input_text": " further need for a method for electroless deposition, such as copper electroless deposition, that does not require nucleation layers, implants, or activation baths for providing nucleation sites on surfaces on which the material is to be deposited. It is further desirable to provide a method for electroless deposition that provides a high conductivity layer of deposited material that adheres well to a substrate. The increasing use of hydrogen in chemical industries and oil refining and clean technologies puts pressure on hydrogen sources, hydrogen production capacities and hydrogen supplies. H2 is used for example for fuel desulfurization, production of ammonia NH3, methanol and other alcohol, urea, hydrochloric acid HCl, in Fischer-Tropsch reactions, i.e. conversion of CO and H2 into liquid hydrocarbon, as a reducing agent in metallurgy and for adding value to petroleum products and oils by hydrogenation. 
More than 41 million tons of H2 are produced annually, of which 80% by steam reforming, partial oxidation and auto thermal reforming of natural gas. Renewable hydrocarbons and biogas are also used as starting sources.\nMethane steam reforming (see relation 1 below) is performed at high temperature, typically between about 800\u00b0 C. and about 900\u00b0 C. The resulting H2 and CO gas mixture is cooled down to a temperature in a range comprised between about 350\u00b0 C. and 450\u00b0 C. upon exiting a first reactor, and introduced in a second reactor where a water gas shift reaction (WGS) takes place (relation 2 below):CH4+H2O\u21923H2+CO\u2003\u2003(relation 1)CO+H2O\u21c4H2+CO2\u2003\u2003(relation 2)\nThen H2 (40 mol %) is mixed with CO2 (55 mol %), CO (3 mol %) and H2S (1-3 mol %)"} +{"output_text": " re-use of the scouring pad.\nIt is another object of the present invention to provide a holder for a scouring pad that is simple in construction and inexpensive to manufacture.\nIt is a further object of the present invention to provide a holder for a scouring pad that is simple to use and convenient to store.\nIt is still another object of the present invention to provide a holder for a scouring pad that is convenient to use and convenient to store.\nIt is still a", "input_text": ",064 3,473,184 4,071,983 4,232,420 4,244,075 5,426,810\nIt will be seen from the prior art listed above that for an interval spanning more than seventy years a great deal of creativity was exercised to develop the many different types of devices of accomplishing a rather mundane, yet unpleasant, function, namely, the cleaning of cooking equipment and utensils. 
Close scrutiny of the structures and mode of operation of the devices described and illustrated in the patents listed above also indicates that these prior art devices are structurally and functionally different from the structure and mode of operation of the invention disclosed herein.\nThere are several different types of scouring pads, each being an article of manufacture that is generally available in a variety of stores where household goods and utensils are sold. One familiar type is sold under the trademark TUFFY and is formed from synthetic resinous strand material formed generally into a spherical body or mass that is customarily held in the hand and compressed when pressure is applied and the mass is manipulated to effect a scouring action. Another type of scouring pad is formed from stainless steel wire or strands, also gathered together during the manufacturing process to form a generally flat circular body or mass that may be manipulated by hand or with a holder to effect a scouring action.\nIt is not generally known that these two types of scouring pads may be used and re-used following a cleaning operation after use, such as might be effected in a conventional dishwasher. Accordingly, it is one of the important objects of the present invention to provide a holder for such scouring pads that will enable application of scouring pressure on the holder and therefore the detachably secured pad and manipulation thereof to effect a scouring action, while detachably retaining the scouring pad to facilitate removal of the scouring pad from the holder for cleaning, and"} +{"output_text": " device, must be manually held or grasped by the operator's hands, and in accordance with the aforenoted first mode, the operator must walk around the palletized load or product in order to apply the stretch film to the palletized load or product. 
In accordance with the aforenoted second mode, the operator must hold or grasp the film roll dispensing or holding device, and in accordance with the aforenoted first mode, the operator must walk around the palletized load or product", "input_text": " a known fact that approximately fifty per-cent (50%) of all stretch film that is manufactured is applied to, for example, palletized loads or products by manual means. It is also known that when applying such stretch film to, for example, palletized loads or products, the manner in which such stretch film is manually applied to such loads or products usually comprises either one of two methods. In accordance with a first one of such manual methods, as illustrated, for example, within U.S. Pat. No. 5,398,884 which issued to Stanford on Mar. 21, 1995, the operator respectively inserts four fingers of each hand into each one of two oppositely disposed recessed portions defined within the film core end caps so as to effectively hold or grasp the film roll, and while placing his thumbs upon outside surface portions of the film roll, so as to effectively cause a predetermined amount of back tension to be applied to the film whereby the film is effectively stretched as the film is being unrolled or dispensed from the film roll, the operator walks around the palletized load or product. In accordance with a second one of such manual methods of applying a stretch film to such palletized loads or products, as illustrated, for example, within U.S. Pat. No. 5,458,841 which issued to Shirrell on Oct. 17, 1995, and in lieu of directly holding or grasping the film roll, the operator holds or grasps a film roll dispensing or holding device which has a built-in tensioning mechanism.\nIn accordance with either one of the aforenoted modes, methods, or manners in which stretch film is applied manually to the palletized products or loads, several operational disadvantages or drawbacks common to both methods or modes were apparent. 
Firstly, for example, the film roll, or the film roll and film roll dispensing or holding"} +{"output_text": "comes the above-mentioned problems by providing a float/guide member which is uniquely adapted to receive the contact pins of the edge connector, allowing the pins to extend through the float/guide member and into the mother printed circuit board, whereby the card edge connector is able to move or float laterally on its contact pins with respect to the mother printed circuit board. The latter characteristic allows a daughter board associated with the card edge connector to be properly aligned and mated with a second card edge connector mounted remotely either", "input_text": " in oxygen atmosphere and normal atmosphere (air). In case of crystalline forms I, II and III the percent of oxidation degradation products was low in air atmosphere, while in oxygen atmosphere the percent of oxidation degradation products increased in crystalline forms II and III.\nWe can conclude that crystalline form I is stable to oxygen and oxidation, crystalline forms II and III are slightly sensitive to oxidation and crystalline form IV and amorphous atorvastatin are highly sensitive to oxidation. 1. Field of the Invention\nThe invention relates generally to a float/guide member which is adapted to be mounted between a mother printed circuit board and a card edge connector. More particularly, the invention pertains to a float/guide member which is uniquely adapted to receive the contact pins of the edge connector, allowing the pins to extend through the float/guide member and into the mother printed circuit board, whereby the card edge connector is able to move or float laterally on its contact pins with respect to the mother printed circuit board. 
The latter characteristic allows a daughter board associated with the card edge connector to be properly aligned and mated with a second card edge connector mounted remotely either on the mother printed circuit board or on a separate printed circuit board.\nA typical problem attendant with mother printed circuit boards provided with an in-line series of fixed card edge connectors is that they exceed the pad width tolerances on daughter boards and, when a daughter board is plugged into the connectors, open and shorted daughter board connections are apt to occur. When a tighter-tolerance pitch connector, such as a 2\u00d7120 0.050 inch pitch card edge connector, is placed in-line with a smaller card edge connector, such as a 2\u00d76 0.100 inch pitch connector, the problem of daughter board connections is even more prevalent, as is the case where one daughter board plugs into two card edge connectors on different printed circuit boards.\nThe subject invention addresses and over"} +{"output_text": " attached on a rear surface of the cabinet 90. The supporting material 93 is attached on a rear surface of the activated carbon 92. The diaphragm 94 is attached on a front surface of the supporting material 93.\nThe cabinet 90 is made of a material having a high acoustic stiffness, such as a wood material. The woofer 91 is made of a material having a low acoustic stiffness, such as a resin material. 
The activated carbon 92 is made of a material having a high acoustic stiffness, such", "input_text": " be self-activated only under a very limited condition, and a large start torque cannot be obtained.\n2) Since a magnetic balance has to be destroyed, a magnetic pole shape becomes special, and leakage flux is increased, thus the motor efficiency is lowered.\n3) As the two stator yokes have different magnetic pole shapes, two press molds are required, and management of parts becomes complicated.\nIn view of the above, this invention is to solve the above drawbacks and aims to simplify the connection of the coil unit and the circuit unit, and makes it easy to replace the circuit unit alone and adjust even after the assembling of the motor, and reduces the number of fitting parts, facilitates the assembly, reduces manhour, improves the motor performance, and provides an inexpensive brushless DC motor having a high starting torque, providing starting ability in a wide range, having a high motor efficiency and productivity. In a conventional loudspeaker system, it is difficult to realize, due to an effect of acoustic stiffness caused by an internal cavity of a cabinet, a loudspeaker system which is small and capable of bass reproduction. This reproduction limit of bass is determined depending on a degree of the acoustic stiffness, that is, a capacity of the cabinet. Thus, as one of the solutions to the problem of the reproduction limit of the bass, a loudspeaker system having an aggregate of granular activated carbon located in an inside of the cabinet thereof is suggested (for example, see patent document 1).\nFIG. 22 is a tectonic profile of a main section of a conventional loudspeaker system. In FIG. 22, the conventional loudspeaker system comprises a cabinet 90, a woofer 91, activated carbon 92, a supporting material 93, and a diaphragm 94. The woofer 91 is attached on a front surface of the cabinet 90. 
The activated carbon 92 is"} +{"output_text": " biologically degradable organic material.\nThe anaerobic fermentation of organic material is known. In this process, the organic material is mixed with a quantity of already fermented material as an inoculum for the active anaerobic fermentation, and the mixture is introduced at the top into a fermentation chamber in which a fermenting mass is situated, which moves from an inlet situated at the top towards an outlet situated at the bottom.\nThe fermenting mass is usually a mixture of a liquid and a solid phase. The liquid", "input_text": ". Biol. Chem. 272: 22472\u201322480, 1997. Thus other PTPase inhibitors are potentially effective in countering osteoclast activity, and thus treating osteoporosis.\nPTPases: Microorganisms\nDixon and coworkers have called attention to the fact that PTPases may be a key element in the pathogenic properties of Yersinia (reviewed in Clemens et al. Molecular Microbiology 5: 2617\u20132620 (1991)). This finding was rather surprising since tyrosine phosphate is thought to be absent in bacteria. The genus Yersinia comprises 3 species: Y. pestis (responsible for the bubonic plague), Y. pseudotuberculosis and Y. enterocolitica (causing enteritis and mesenteric lymphadenitis). A dual-specificity phosphatase, VH1, has been identified in Vaccinia virus (Guan et al., Nature 350: 359\u2013263 (1991)). These observations indicate that PTPases may play critical roles in microbial and parasitic infections, and they further point to PTPase inhibitors as a novel, putative treatment principle of infectious diseases. Availability of PTPase inhibitors would help shed light on all the foregoing speculations about PTPase function because they would enable assaying techniques which would answer some of these questions as will be illustrated below. 1. 
Field\nThis invention relates to a method for anaerobically fermenting biologically degradable organic material, whereby this material is mixed with a quantity of already fermented material as an inoculum for the active anaerobic fermentation, and whereby this mixture is introduced at the top into a fermentation chamber in which a fermenting mass is situated, which moves from an inlet situated at the top towards an outlet situated at the bottom.\nB. Related Art\nBy organic material, here in particular the organic fraction of domestic waste is intended, and of similar industrial waste and other"} +{"output_text": "1 and S2 of the first and second periodic patterns is taken as S, the following relation is satisfied:\nL1xe2x89xa6L2xe2x89xa6Sxe2x89xa6S2.\nIn accordance with a tenth aspect of the present invention, there is provided a mask having periodic patterns, wherein each of the first and second periodic patterns has a unit pitch defined by a line and a space, wherein the first periodic pattern has a first pitch", "input_text": " of the present invention, there is provided a mask having a plurality of periodic patterns, wherein adjoining pattern portions (lines or spaces) of first and second periodic patterns of them, juxtaposed with each other, have opposite phases.\nIn accordance with a fifth aspect of the present invention, there is provided an exposure method including a process for exposing a photosensitive substrate to any one of the masks as recited above.\nIn accordance with a sixth aspect of the present invention, there is provided a multiple exposure method including a first exposure process using any one of the masks as recited above, and a second exposure process using another mask.\nIn accordance with a seventh aspect of the present invention, there is provided an exposure apparatus having an exposure mode for performing a process according to an exposure method as recited above, and a different exposure mode.\nIn accordance with an eighth aspect of 
the present invention, there is provided a device manufacturing method, characterized by an exposure process for exposing a wafer to a device pattern by use of an exposure method as recited above, and a developing process for developing the exposed wafer.\nIn accordance with a ninth aspect of the present invention, there is provided a mask having periodic patterns, wherein each of the first and second periodic patterns has a unit pitch defined by a line and a space, wherein the first periodic pattern has a first pitch P1 with a line width L1 and a space width S1 while the second periodic pattern has a second pitch P2 with a line width L2 and a space width S2, wherein the first and second pitches P1 and P2 are different from each other, wherein the first and second periodic patterns are juxtaposed with each other with respect to the periodicity direction, with a spacing D, and wherein, when one of the line widths L1 and L2 of the first and second periodic patterns is taken as L while one of the space widths S"} +{"output_text": " microcomputer 53, and a communication circuit 6 to perform data communications with the other ECUs 50b-50f. \nThe ECU 50b comprises an input circuit 2 to perform input processing of signals from a sensor, a switch and the like, a microcomputer 53 to perform various types of computing based on the input signals captured through the input circuit 2, an output circuit 4 to output control signals computed in the microcomputer 53 to an actuator and the like, a power circuit 5 to", "input_text": " control system and, more particularly, to a control system which enables a reduction in electric power consumption of backup power by making use of data communications between control units.\n2. Description of the Relevant Art\nIn recent years, a lot of ECUs (Electronic Control Units) have been mounted on vehicles for electronic control, and various types of control have been performed. 
Among these ECUs, an ECU for EFI for performing fuel injection control of an engine, an ECU for ABS for performing drive control of solenoid valves for hydraulic control and the like, an ECU for a transmission for performing drive control of solenoid valves for shifting and the like, an ECU for air bags for controlling spreading of air bag systems, and an ECU for body work for controlling keyless entry and the like are exemplified. The specifications of these ECUs mounted on vehicles vary with car models and grades.\nRecently, in order to increase the efficiency of control processing between these ECUs and reduce component costs or the like, multiple ECUs have been mutually communicably connected through a communication line so that they can share data of each ECU.\nFIG. 11 is a block diagram schematically showing the construction of a control system including conventional ECUs. Reference numeral 60 in the figure represents a control system, which is constituted of ECUs 50a-50f mutually connected through a communication line 12. A vehicular LAN system is constructed of these ECUs 50a-50f. \nThe ECU 50a comprises an input circuit 2 to perform input processing of signals from a sensor, a switch and the like, a microcomputer 53 to perform various types of computing based on the input signals captured through the input circuit 2, an output circuit 4 to output control signals computed in the microcomputer 53 to an actuator and the like, a power circuit 5 to provide stable power supply voltage to the"} +{"output_text": " thick film is desired.\nIn an attempt to overcome the problems associated with conventional drying methods, U.S. Pat. No. 5,948,430 to Zerbe et al. and U.S. Pat. No. 6,001,917 to Zerbe et al. disclose the use of a vacuum-assisted drying process. The vacuum-assisted drying process is a relatively new drying method that has been developed to overcome the problems associated with conventional drying methods. 
The", "input_text": " accurate dosage form and instead attempted to solve this problem by forming a multi-layered film. Moreover, his process is a multi-step process that adds expense and complexity and is not practical for commercial use.\nOther U.S. Patents directly addressed the problems of particle self-aggregation and non-uniformity inherent in conventional film forming techniques. In one attempt to overcome non-uniformity, U.S. Pat. No. 5,629,003 to Horstmann et al. and U.S. Pat. No. 5,948,430 to Zerbe et al. incorporated additional ingredients, i.e. gel formers and polyhydric alcohols respectively, to increase the viscosity of the film prior to drying in an effort to reduce aggregation of the components in the film. These methods have the disadvantage of requiring additional components, which translates to additional cost and manufacturing steps. Furthermore, both methods employ conventional time-consuming drying methods such as a high-temperature air-bath using a drying oven, drying tunnel, vacuum drier, or other such drying equipment. The long length of drying time aids in promoting the aggregation of the active and other adjuvant, notwithstanding the use of viscosity modifiers. Such processes also run the risk of exposing the active, i.e., a drug, or vitamin C, or other components to prolonged moisture and elevated temperatures, which may render it ineffective or even harmful.\nIn addition to the concerns associated with degradation of an active during extended exposure to moisture, the conventional drying methods themselves are unable to provide uniform films. The length of heat exposure during conventional processing, often referred to as the \u201cheat history\u201d, and the manner in which such heat is applied, have a direct effect on the formation and morphology of the resultant film product. Uniformity is particularly difficult to achieve via conventional drying methods where a relatively"} +{"output_text": " surface. 
The tool can be used to take formation fluid samples, or to perform other formation testing.\nThe formation testing apparatus can be used to perform a variety of formation testing operations. For example, the formation testing apparatus can be used to measure the pressure of the formation fluid, to measure the resistivity of the formation fluid, to measure the permeability of the formation fluid, to measure the porosity of the formation fluid, to measure the temperature of the formation fluid, to measure the pH of the formation", "input_text": " the pressure testing and fluid sampling of potential hydrocarbon reservoirs as soon as the borehole has been drilled into the reservoir, without removal of the drill string. Further, there is a need for a method and apparatus that will allow for adjusting drilling fluid density in response to changes in downhole pressures to achieve maximum drilling efficiency. Finally, there is a need for a method and apparatus that will allow for blow out prevention downhole, to promote drilling safety.\nA formation testing method and a test apparatus are disclosed. The test apparatus is mounted on a work string for use in a well borehole filled with fluid. It can be a work string designed for drilling, re-entry work, or workover applications. As required for many of these applications, the work string may be one capable of going into highly deviated holes, horizontally, or even uphill. Therefore, in order to be fully useful to accomplish the purposes of the present invention, the work string must be one that is capable of being forced into the hole, rather than being dropped like a wireline. The work string can contain a Measurement While Drilling (MWD) system and a drill bit, or other operative elements. 
The formation test apparatus may include at least one expandable packer or other extendable structure that can expand or extend to contact the wall of the well borehole; a device for moving fluid, such as a pump, for taking in formation fluid; a non-rotating sleeve; an extendable stabilizer blade; a coring device; and at least one sensor for measuring a characteristic of the fluid or the formation. The test apparatus will also contain a controller, for controlling the various valves or pumps which are used to control fluid flow. The sensors and other instrumentation and control equipment must be carried by the tool. The tool must have a communication system capable of communicating with the surface, and data can be telemetered to the"} +{"output_text": ".\nIn the case of a lens system that is equipped with a drive mechanism that drives the lens to be displaced in the optical axis direction, the lens is driven to be displaced in the optical axis direction by the drive mechanism. Therefore, the lens is displaced in the optical axis direction by the drive mechanism even when the lens is not driven to be displaced in the optical axis direction by the drive mechanism. Accordingly, the lens is displaced in the optical axis direction by the drive mechanism even when the lens", "input_text": " thereby resulting in mismatch in timing of the operation of the base printer and the operation of the scanning mechanism. Such a time-mismatch may cause additional wear and acoustic noises and may lead to thermal problems.\nAn existing solution to the aforementioned problems is to use a single pass ADF that includes a second scan bar fixed within a feed path loop of the ADF, thereby allowing capturing of both sides of a media sheet in a single pass. However, employing a second scan bar increases cost. Further, such an ADF allows for generating images at a speed much faster than the processing speed of a base printer engine associated with the ADF. 
Accordingly, a scanner mechanism of such an ADF is often required to transit between a \u2018start\u2019 mode and a \u2018stop\u2019 mode as the base printer engine processes scanned images at a slower pace.\nAccordingly, there is a need for an efficient and a cost-effective media retractor and recycler that facilitates in achieving a sufficiently high throughput during a duplex scanning or a duplex printing and facilitates in reducing inter-page gap between consecutive media sheets. The widespread availability of camera phones having a camera function in recent years has increased the opportunities for users to photograph various kinds of photographic subjects. For example, a photographic subject at a distance from the camera lens, such as a friend or scenery, is photographed (normal snapshot) or a photographic subject at a close distance from the camera lens, such as a bus time schedule or flower petals, is photographed (close-up photography).\nFor close-up photography (macro photography), the camera lens needs to be positioned slightly closer to the photographic subject than for a normal snap shot. Therefore, a photographing lens system of this kind is equipped with a drive mechanism that drives the lens to be displaced in the optical axis direction; by switching a switch, the drive mechanism is driven to move the lens in the optical axis direction"} +{"output_text": "LTE) access network. SRVCC is a mechanism for allowing a voice call to be established and maintained between a user equipment (UE) and a user (U) when the UE is in a connected mode and the UE is moving from a first access network to a second access network. 
The SRVCC mechanism is based on the IMS architecture and is defined in the 3GPP TS 23.237 v11.4.0.\nThe SRVCC mechanism is based on the", "input_text": " direction of the vehicle, an impact force can be absorbed by a breakdown of the wall section which is substantially perpendicular to the running direction of the vehicle in the case of a car collision. Therefore, deformation of the wall section which is substantially horizontal to the running direction of the vehicle can be reduced to as small as possible. For example, damage given to the injector and the fuel tube, which are attached to the wall section of the connecting section with the internal combustion engine, can be reduced.\nIn the wall section in the suction device used for an internal combustion engine of the seventh embodiment of the present invention, when the suction device is given an impact force from the front in the case of a car collision, a breakdown is caused in a transition region which is formed from a portion substantially perpendicular to the running direction of the vehicle, the cross section of which is formed into a substantial semicircle, to a portion substantially horizontal to the running direction of the vehicle. Due to the foregoing, deformation of the wall section of the connecting section of the suction device with the internal combustion engine can be reduced to as small as possible.\nThe present invention will be more fully understood from the description of preferred embodiments of the invention set forth below, together with the accompanying drawings. IP Multimedia Subsystem (IMS) is a standardised and established architecture for delivering IP multimedia services to end users. IMS is to a large extent agnostic concerning the access network used by the end users: access networks may be wireless or fixed line. In the context of IMS, it is important to allow end users to seamlessly move between access networks and access technologies, e.g. 
to allow voice and video call continuity during such movements.\n3GPP TS 23.237 v11.4.0 specifies Single Radio Voice Call Continuity (SRVCC) as a functionality defined for the Long Term Evolution ("} +{"output_text": ".\nNatural gas is a cleaner burning fuel than diesel, and is less expensive. However, natural gas is not always available at a wellsite, and may not be available at all in some areas. Natural gas is also not always available at a wellsite in sufficient quantities to power all of the fracturing equipment.\nIn addition, natural gas is not always available at a wellsite in sufficient quantities to power all of the fracturing equipment. Natural gas is also not always available at a well", "input_text": " 112.\nThe extent to which the channel width L can be decreased is limited in part by restraints associated with photolithography techniques. Likewise, simply increasing the width W of the channel 116 in a conventional manner by extending the gate structure 114 reduces the number of transistor that can be formed in a given area of a semiconductor substrate. Thus, increasing the width to length ratio W/L in a planar transistor can be difficult. 1. Technical Field\nThis disclosure relates generally to hydraulic fracturing and more particularly to systems and methods for spare turbine power generation, which is sometimes referred to as reserve power.\n2. Background\nWith advancements in technology over the past few decades, the ability to reach unconventional sources of hydrocarbons has tremendously increased. Horizontal drilling and hydraulic fracturing are two such ways that new developments in technology have led to hydrocarbon production from previously unreachable shale formations. Hydraulic fracturing (fracturing) operations typically require powering numerous components in order to recover oil and gas resources from the ground. 
For example, hydraulic fracturing usually includes pumps that inject fracturing fluid down the wellbore, blenders that mix proppant into the fluid, cranes, wireline units, and many other components that all must perform different functions to carry out fracturing operations.\nUsually in fracturing systems the fracturing equipment runs on diesel-generated mechanical power or by other internal combustion engines. Such engines may be very powerful, but have certain disadvantages. Diesel is more expensive, is less environmentally friendly, less safe, and heavier to transport than natural gas. For example, heavy diesel engines may require the use of a large amount of heavy equipment, including trailers and trucks, to transport the engines to and from a wellsite. In addition, such engines are not clean, generating large amounts of exhaust and pollutants that may cause environmental hazards, and are extremely loud, among other problems"} +{"output_text": " portionxe2x80x9d.\nIn the present invention, the water soluble resin is preferably contained more in the inner portion than in the intermediate portion.\nIn the present invention, the water soluble resin is preferably contained more in the inner portion than in the intermediate portion.\nIn the present invention, the water soluble resin is preferably contained more in the inner portion than in the intermediate portion.\nIn the present invention, the water soluble resin is preferably contained more in the inner portion than", "input_text": " a problem in view of safety to human skins.\nThen, in Japanese Patent Laid-Open No. 228214/1997, it is intended to improve the strength and make the water decomposability favorable by selecting the fiber length of the regenerated cellulose fibers. However, it is actually difficult to appropriately make a balance between the strength and the water decomposability. 
Moreover, since the entire strength is intended to be obtained merely by the entangled state of the fibers, the surface strength of the non-woven fabric is extremely low and the non-woven fabric involves a problem that the fibers appearing on the surface drop off during wiping operation or the surface of the non-woven fabric is broken easily.\nThe present invention intends to overcome the foregoing problems in the prior art and it is an object thereof to provide a cleaning article by using a non-woven fabric of satisfactory water decomposability, in which the surface strength of the non-woven fabric is increased thereby enabling to prevent fluffing on the surface and dropping of fibers upon wiping operation and, further, prevent breakage on the surface, as well as a manufacturing method thereof.\nIn accordance with the present invention, the foregoing object can be attained by a cleaning article comprising a water-decomposable non-woven fabric containing water dispersible fibers and a water soluble resin coated on at least one side of the water-decomposable non-woven fabric, in which the water soluble resin is contained more in a surface portion of a fiber assembly than in a remaining portion of the fiber assembly.\nHere, when the water soluble resin is coated on both sides of the non-woven fabric, the remaining portion of the fiber assembly, as sandwiched between two surface portions, may be called \u201cinner portion\u201d or \u201cintermediate"} +{"output_text": " the acoustic energy in the switch to the acoustic energy in the substrate is relatively low.\nA touch switch such as shown in U.S. Pat. No. 5,748,583 includes a piezoelectric transducer mounted on a surface of a substrate opposite a touch surface of the substrate. The transducer generates an ultrasonic wave that propagates in a direction across the thickness of the substrate to the touch surface and reflects off of the touch surface back to the transducer. 
The ultrasonic wave appears to be a", "input_text": " conductive fluids and/or an ionizing atmosphere and can be made inoperable thereby. Further, the enclosure through which touch is sensed cannot be made of an electrically conducting material, so that metals and the like cannot be used. Piezoelectric switches such as supplied by Schurter or Wilson-Hurd, operate by transferring finger pressure via a metal overlay to a piezoelectric element which generates a voltage when compressed. This type of switch is expensive compared to a standard membrane switch and shares the disadvantages of membrane switches in that holes in the housing or enclosure are required to accommodate the switch. Further, the metal overlay is necessarily thin, so that the piezoelectric element is relatively unprotected against blows to the overlay. Another type of switch shown in U.S. Pat. No. 5,149,986 is based on the absorption of sound in a glass, ball-shaped button when the button is touched. In operation, a transducer sends sound waves into the glass balls and then receives back the echoes in a sonar type fashion. A circuit analyzes the echoes to determine whether the echoes have been reduced indicating a touch. This type of switch is relatively expensive and again requires openings in the housing or enclosure in which the switch is to be mounted.\nAn acoustic wave switch such as shown in U.S. Pat. No. 5,673,041 includes an ultrasonic piezoelectric transducer mounted on a surface of a substrate opposite a touch surface of the substrate. The transducer generates an ultrasonic wave that propagates in a direction across the thickness of the substrate to the touch surface and reflects off of the touch surface back to the transducer. The ultrasonic wave appears to be a compressional wave. A touch on the touch surface changes the acoustic reflectivity of the surface and changes the impedance of the transducer. 
The acoustic energy in this switch is not confined and spreads out into the plane of the substrate. As such, the ratio of"} +{"output_text": " xe2x80x83 xe2x80x83 xe2x80x83 xe2x80x83 xe2x80x83 xe2x80x83 xe2x80x83 xe2x80x83 xe2x80x83 xe2x", "input_text": " to the case of explicit methods mentioned above, this difference equation (7) can be rearranged as follows, by letting r=\u0394t/\u0394x^2.\n-r f_{i+1}^{n+1} + (1+2r) f_i^{n+1} - r f_{i-1}^{n+1} = f_i^n  (8)\nThis equation (8) is an implicit expression of the problem to be solved. Unlike the explicit methods, the numerical stability is guaranteed when solving this implicit expression. In the implicit method, however, it is necessary to solve the following set of simultaneous equations in order to obtain a series of f_i^{n+1}. ( b_1 c_1 ; a_2 b_2 c_2 0 ; \u22f0 ; \u22f0 "} +{"output_text": " These separators are based on the principle of inertial impaction. The inertial impaction separators are designed to remove droplets of a certain size by using a high velocity gas stream to accelerate the droplets to a high velocity. The droplets are then separated from the gas stream by a series of baffles or plates. The most common type of inertial impaction separators are cyclones. Cyclones are designed to remove droplets of a certain size by using a high velocity gas stream to accelerate the droplets to a", "input_text": " in order to remove liquid droplets from industrial gas streams to satisfy environmental standards (e.g., radioactive water from steam at nuclear power plants) or to purify gas streams, increase liquid recovery, and to protect rotating equipment located downstream (e.g., oil processing facilities, engine air intakes, gas processing plants). 
A complete phase separation will eventually occur without employing any mechanical devices given the effects of gravity and long contact times; however, to accelerate this process several separation techniques have been proposed. These techniques operate based on one or more physical forces accelerating fluid separation, such as inertial, gravitational, diffusional, centrifugal and electrostatic. Mechanical equipment operating on these principles include impingement separators (baffle, wire mesh, vanes), cyclones, knock-out pots, and filters, as described in U.S. Pat. No. 6,017,377, and wet precipitators, as described in U.S. Pat. No. 5,843,210.\nThe above separation techniques are selected based on the liquid collection efficiency requirement, gas flow rate and liquid loading, solid deposition tolerance, pressure drop, and capital cost. There is a need to develop liquid/gas separators that will achieve a high level of liquid removal efficiency and throughput and at the same time minimize the amount of energy that is required to treat the gas (pressure drop) and minimize capital cost.\nOne of the most widely used gas/liquid separators are impingement separators. The basic elements of impingement separators are strategically located devices (targets) on which liquid droplets collide. The simplest impingement separators consist of a baffle or disk inserted against the vessel inlet. These separators provide low droplet removal efficiency but can remove the bulk of the liquid entering the vessel. 
To improve efficiency and recovery of smaller droplets more sophisticated impingement separators have been developed."} +{"output_text": " space for storing the signal received from a specific electrode; (7) the allocation of memory space for storing the signal received from a specific electrode; (8) the allocation of memory space for storing the signal received from a specific electrode; (9) the allocation of memory space for storing the signal received from a specific electrode; (10) the allocation of memory space for storing the signal received from a specific electrode; (11) the allocation of memory space for storing the signal received from a specific electrode", "input_text": " the patient\"\"s physician on a regular basis (e.g., every three months, or more frequently if the device did not promptly terminate some neurological event). It is also anticipated that the patient could use a patient\"\"s initiating device to trigger the retention of several minutes of data recording of the EEG signal from a pre-selected group of electrodes.\nIt is also conceived that certain other data be recorded that can be helpful to the physician for treating the patient. These additional data would include: (1) the number of neurological events detected since the last memory readout and; (2) the number of responses triggered by the neurological events that were delivered to the patient. Furthermore, the system can be programmed so that when a neurological event is detected, the electrical signal from any one or more of the multiple steps in the signal conditioning can be stored in a digital memory. 
Additionally, telemetry would be provided to the physician that would indicate the serial number of the device that is implanted in the patient and the date and time that each neurological event or patient initiated recording occurred.\nAnother valuable attribute of the present invention is the capability to program the functions and parameters of the system to enhance the detection of a neurological event and to optimize the system responses for stopping a neurological event such as an epileptic seizure. Examples of programmable functions and parameters arc: (1) the time delay introduced for a signal being received from a specific electrode; (2) the use or non-use of a specific electrode; (3) the frequency response characteristic of the channel assigned to process the signal received from a specific electrode; (4) whether or not a particular electrode is electrically shorted to another electrode or to the metal case of the device after a neurological event has been detected; (5) the amplitude, frequency, duration, phase and wave-form of the response signal delivered to a specific electrode; (6) the allocation of memory"} +{"output_text": " by the thickness of the nitride layer. The nitride layer was removed by wet etching and the oxide layer was removed by dry etching. The oxide layer was then removed by wet etching. The oxide layer was removed by wet etching. The oxide layer was removed by wet etching. The oxide layer was removed by wet etching. The oxide layer was removed by wet etching. The oxide layer was removed by wet etching. The oxide layer was removed by wet etching. The oxide layer was removed by wet etching. The", "input_text": "Independent Gate Length, IDEM, 1999, p. 75, the disclosure of which is hereby incorporated herein by reference in its entirety. 
As described in the abstract of Hergenrother et al., the VRG MOSFET is the first MOSFET ever built that combines 1) a gate length controlled precisely through a deposited film thickness, independently of lithography and etch, and 2) a high-quality gate oxide grown on a single-crystal Si channel. In addition to this unique combination, the VRG-MOSFET includes a self-aligned S/D formed by solid source diffusion (SSD) and small parasitic overlap, junction, and S/D capacitances. The drive current per xcexcm of coded width is significantly higher than that of advanced planar MOSFETs because each rectangular device pillar (with a thickness of minimum lithographic dimension) contains two MOSFETs driving in parallel. All of this is achieved using current manufacturing methods, materials, and tools, and competitive devices with 50-nm gate lengths (LG) having been demonstrated without advanced lithography. See the Hergenrother et al. abstract.\nAs also described in the xe2x80x9cDevice Fabricationxe2x80x9d section of Hergenrother et al., in the VRG process, arsenic was implanted into an epi Si wafer to form the device drain and a thin oxide diffusion barrier was deposited. A PSG/nitride/undoped oxide/nitride/PSG/nitride stack was deposited and a trench (or window) with nearly vertical sidewalls was etched through the entire stack. The boron-doped epitaxial Si device channel was grown selectively in this trench and the channel was planarized to the top nitride layer by CMP. The undoped oxide film in the stack was a sacrificial layer whose thickness was defined"} +{"output_text": " the level regulator, and the output voltage from the level regulator is input to a linear element of the level regulator. 
The output voltage from the level regulator is used to supervise the input level.\nHowever, the above-mentioned conventional methods of the art have the following disadvantages.\nIn the conventional method of the art, the input level is supervised by the control voltage for driving the variable element of the level regulator. The control voltage is a voltage which varies approximately in accordance with a decibel change", "input_text": " a signal for lighting an alarm lamp and ringing an alarm bell when the pilot signal level exceeds a range that can be regarded as normal, and; for a pilot signal level indicating function of constantly indicating and recording the pilot signal level through a meter or a recorder for improved maintenance.\nGenerally, supervision of the pilot signal level includes supervision of the level of input to the level regulator and supervision of the level of output therefrom. Of these types of supervision, the output level supervision has conventionally been employed. The two types of supervision are different in the supervising accuracy of the input level. It goes without saying that it is more desirable for higher accuracy to supervise the input level than to supervise the output level which has the same relation to the input level in a compressed state. Further, the International Telegraph and Telephone Consultative Committee recommends the input level supervision as a preferable one. In a conventional method of this art, the input level supervision uses an electric voltage which varies approximately in accordance with a decibel change in the level of input to the level regulator, which voltage is available from the control voltage for driving the variable element of the level regulator or a like voltage, thus supervising a pseudo input level. To be concrete, if there is a change in the input level, the change is detected so that the gain is controlled by the thermistor or the FET of the level regulator. 
Since the output voltage from the control circuit for driving the thermistor or the FET is somewhat mutually related to the input level, said output voltage, i.e., control voltage, is directly used to supervise a pseudo input level.\nFurther, an improved method of the above-mentioned conventional supervising method has been proposed by German Patent P 22 35 230.3, owned by Siemens Aktiengesellischaft. According to this method, said control voltage is input to a non-linear element of"} +{"output_text": " f i + 1. \u2062 j n ", "input_text": " given. \u2202 f \u2202 t = \u2202 2 \u2062 f \u2202 x 2 + \u2202 2 \u2062 f \u2202 y 2 ( 10 ) \nTo solve this equation (10) with, for example, the Crank-Nicolson method, its time derivative term is approximated as follows. f i . j n + 1 - f i . j n \u0394 \u2062 xe2x80x83 \u2062 t = xe2x80x83 \u2062 1 2 \u2062 { f i + 1. \u2062 j n - 2 \u2062 f i . j n + f i - 1. \u2062 j n \u0394 \u2062 xe2x80x83 \u2062 x 2 + f i . j + 1 n - 2 \u2062 f i . j n + "} +{"output_text": "(CH3)2, \u2014S\u2014C(CH3)3, \u2014NH\u2014CH3, \u2014NH\u2014C2H5, \u2014NH\u2014CH(CH3)2, \u2014NH\u2014C(CH3)3, \u2014N(CH3)2, \u2014N(C2H5)2, \u2014N(CH(CH3)2)2, \u2014N(C(CH3)3)2, \u2014N(CH3)(C2", "input_text": " (2,3)-dihydrobenzo[1.4]dioxinyl, benzo[1.3]dioxolyl, (1,4)-benzodioxanyl, (2,3)-dihydrothieno[3.4-b][1.4]dioxinyl, (3,4)-dihydro-2H-benzo[1.4]oxazinyl, octahydro-1H-isoindolyl, and octahydropyrrolo[3.4-c]pyrrolyl.\n(Hetero)cycloaliphatic radicals can form, within the scope of the present invention, a spirocyclic radical with another (hetero)cycloaliphatic radical via a carbon atom common to both rings.\nExamples of suitable spirocyclic radicals include a 6-azaspiro[2.5]octyl radical, an 8-azaspiro[4.5]decyl radical and a 1-oxa-2,8-diazaspiro[4.5]dec-2-enyl radical. 
More preferably the (hetero)cycloaliphatic radicals can each be optionally substituted by 1, 2, 3, 4, or 5 substituents independently selected from the group consisting of oxo (\u2550O), thioxo (\u2550S), F, Cl, Br, I, \u2014CN, \u2014CF3, \u2014SF5, \u2014OH, \u2014O\u2014CH3, \u2014O\u2014C2H5, \u2014O\u2014CH(CH3)2, \u2014O\u2014C(CH3)3, \u2014NH2, \u2014NO2, \u2014O\u2014CF3, \u2014S\u2014CF3, \u2014SH, \u2014S\u2014CH3, \u2014S\u2014C2H5, \u2014S\u2014CH"} +{"output_text": " a \"phase boundary\" between the ferrite and austenite phases. This boundary is a line of instability in the composition diagram. The boundary is a line of instability in the composition diagram because the composition of the alloy is such that the alloy is unstable at the boundary. The boundary is a line of instability because the alloy is unstable at the boundary. The boundary is a line of instability because the alloy is unstable at the boundary. The boundary is a line of instability because the alloy is unstable", "input_text": " increasing amounts of power. However, conductive disks formed from any material with tensile and shear strength comparable to that of copper, silver, and aluminum and having a conductivity of at least 270 kilo mho-cm will work.\nUsing an automated hole punching press, disks can be fabricated with tight mechanical tolerances, typically on the order of+/xe2x88x920.0005 inches or less. By comparison, solder balls have diameter tolerances that typically vary from+/xe2x88x920.0007 inch to+/xe2x88x920.003 inch. Tighter mechanical tolerances improve the coplanarity of components, which in turn improves solder joint uniformity, thereby further enhancing reliability. 
The disks are also lighter in weight than lead solder balls of equal diameter.\nIn addition to enhancing electrical and thermal conductivity and reliability in both favorable and unfavorable external environments, the conductive disks of the present invention offer an inexpensive method of replacing the conductive solder balls of a BGA with lead-free, more environmentally friendly metals. Disk grid arrays thus provide an economically feasible way to advance the lead-free initiatives advocated by many governments around the globe. The present invention can use xe2x80x9clead-freexe2x80x9d solder and can be easily applied to new ceramic chip scale packages (xe2x80x9cCCSPxe2x80x9d) and plastic grid arrays (xe2x80x9cPGAxe2x80x9d).\nIn alternative embodiments of the present invention, the disks may be attached by solder, conductive adhesives, or socket or compression fittings. As pointed out in U.S. Pat. No. 3,806,336 issued Apr. 23, 1974, it is known that the iron/chromium alloy system has, in its composition diagram, a \"limit of metastability\" or"} +{"output_text": " advantages of the conventional processes of drying and passivation of coal with the advantages of the novel process of this patent application. The novel process of this patent application provides a process for reducing the predisposition of coal to self-heat in the presence of oxygen. This novel, cost-effective and efficient process for irreversible drying and passivation of coal combines the advantages of the conventional processes of drying and passivation of coal with the advantages of the novel process of this patent application. The novel process of this patent", "input_text": " of gas such as atmospheric air. The entire disclosure of said United States patent is hereby incorporated by reference into this specification.\nU.S. Pat. No. 4,043,763 (stabilization of dried coal) discloses a process of combining completely or partially dried coal with as-mined coal in a weight ratio of 1:2 to 10:1. 
The entire disclosure of said United States patent is hereby incorporated by reference into this specification.\nU.S. Pat. No. 3,723,079 (Stabilization of coal) discloses a process of treating dried coal with 0.5-8% oxygen by weight at a temperature of 175\u00b0 C. to 225\u00b0 C. and rehydrating the coal with water of from 1.5%-6% by weight of oxygen treated coal. The entire disclosure of said United States patent is hereby incorporated by reference into this specification.\nU.S. Pat. No. 4,249,909 (Drying and passivating wet coals and lignite) discloses a staged process of heating under low partial pressure of moisture to 8-12% moisture content then heated to a lower differential vapor pressure to remove additional moisture. The entire disclosure of said United States patent is hereby incorporated by reference into this specification.\nU.S. Pat. No. 3,896,557 (Process for drying and stabilizing coal) discloses a process of heating the coal in a fluidized combustion gas stream containing 7-9% by volume of oxygen to reduce moisture content to 8-12% by volume. The entire disclosure of said United States patent is hereby incorporated by reference into this specification.\nThe novel process described in this patent application provides a process for reducing the predisposition of coal to self-heat in the presence of oxygen. This novel, cost-effective and efficient process for irreversible drying and passivation of coal combines the"} +{"output_text": " at least reduce the disadvantages of the prior art.\nThe invention relates to a method for the production of a semiconductor device, and more particularly to a method for the production of a semiconductor device having a multilayer wiring structure.\nIn recent years, the degree of integration of semiconductor devices has been increased, and the number of layers of wiring has been increased. In order to increase the number of layers of wiring, it is necessary to reduce the thickness of the interlayer insulating film. 
However, when", "input_text": ", A320, and A321 single aisle program series.\nIn order to ensure the safety of passengers who are to board or deplane through the aircraft door, both in the normal course of a flight and in an emergency evacuation situation, corresponding safety regulations require that the area outside of the door must be within view and visually inspected before the door is opened. For example, this means that a boarding stairway, boarding ramp or covered jetway that has been moved to the outside of the fuselage (in the usual situation), or an emergency evacuation slide that has been deployed (in an emergency situation), must be visible to the aircraft crew through the window of the door before the door is opened.\nDue to the above described upward facing tilt of the windows, the field of view out through the window is limited, especially in a downward direction toward the area at the bottom of and below the door, where a stairway, evacuation slide, or the like would be arranged. For this reason, aircraft manufacturers have tried to improve the field of view through the lower portion of the window area, for example by arranging a thick heavy prism in this lower window area. Alternative attempts to improve the field of view in the lower window area have involved complicated mirror mechanisms or full solid lenses arranged to improve the view through the bottom area of the door window. Such prior art solutions have not been satisfactory, because they are costly, complicated, and do not provide a satisfactory improvement of the overall field of view.\nIn view of the above, it is an object of the invention to provide a window arrangement for an aircraft door, that provides an adequate field of view and adequate visibility of the surrounding area outside of the aircraft door, and especially to the area or fuselage section below the window, without interference or disagreeable distortion or limitation of the view, and in a simple economical manner. 
The invention further aims to avoid or"} +{"output_text": " drier, and the pigments are dried in the drier.\nGerman Offenlegungsschrift No. 25 50 070 describes a method of preparing finely divided, readily dispersible calcium carbonate, in which the calcium carbonate is passed into a drier, the surface-active agent is introduced directly into the charging zone of this drier, and the calcium carbonate is dried in the drier.\nGerman Offenlegungsschrift No. 25 50 071 describes a", "input_text": " monomer or polymer substances which facilitate the dispersion of particles in a dispersion agent by reducing the surface tension between the two components, i.e. by wetting.\nDispersion media are substances with the property of preventing agglomerations or aggregations of the particles of pigments, fillers or resins and bringing these into a fine distribution or also countering in advance a tendency to agglomeration or even precipitation. They have surface-active properties and are used in the preparation (grinding) of fillers and dye pigments for coating compositions; and for a better distribution of resin-free pulps. Among those that can be mentioned are sodium hexametaphosphate, sodium pyrophosphate, alkylphenol-polyglycol ether, and alkyl-aryl sulphonic acid salts. The use of them should be confined to the smallest concentrations.\nReadily dispersible inorganic pigments, the surfaces of which bear non-drying, fatty-acid-modified alkyd resins based on polycarboxylic acids, polyols and fatty acids with more than 6 C atoms, are already known from German Auslegeschrift No. 20 01 381.\nGerman Offenlegungsschrift No. 
24 56 463 discloses a method of preparing fine-grained calcium carbonate dispersions by homogenization of precipitated calcium carbonate in the presence of a dispersion medium into a suspension containing 15-25% water and subsequent wet grinding of the suspension, in which, for the purpose of producing readily dispersible calcium carbonate of a high degree of fineness and density, the dispersion is transformed into a free-flowing product.\nGerman Pat. No. 29 08 699 describes a method of producing powder pigments with improved dispersion properties, in which the pigments are passed by known methods into a drier, the surface-active agent is introduced directly into the charging zone of this"} +{"output_text": "/min. The dosage of TMC required for the same lowering of blood pressure is on average about 0.5 mg/kg/min.\nThe dosage of SNP required for the controlled lowering of blood pressure is about 10 times as high as the dosage of TMC required for the same lowering of blood pressure.\nThe dosage of SNP required for the controlled lowering of blood pressure is about 100 times as high as the dosage of TMC required for the same lowering of blood pressure.\nThe dosage", "input_text": ". 302, 1029-1030 (1980); Anesthesiology 44, 345-348 (1976)). Since this \"rebound\" hypertension occasionally causes blood pressure levels which lie far above the initial blood pressure, secondary bleedings can occur in newly operated patients and dangerous blood perfusion disorders in the brain owing to oedema formation can occur in predisposed patients.\nSince, on the other hand, SNP is at present the most active agent for the controlled lowering of blood pressure, e.g., during operations, attempts have been made to eliminate the mentioned disadvantages.\nMacRae has recently proposed (Anaesthesia 36, 312-315 (1981)) to infuse a very dilute solution containing SNP together with the ganglionic blocking agent trimethaphan camsylate (TMC), in the weight ratio 1:10. 
He reported that thereby the amount of SNP required for the same lowering of the blood pressure was considerably lower.\nTMC and its blood pressure-lowering activity are known and TMC is therefore employed therapeutically (in spite of its lower activity) similarly to SNP, i.e., as an infusion preparation for the controlled short-term lowering of blood pressure. However, TMC displays, in turn, a series of side effects which restrict its use.\nThus, in addition to such side effects as tachycardia, mydriasis, cycloplegia, urine retention, xerostomia and constipation, which occur by blockade of the parasympathetic ganglia, nausea or vomiting can arise in sensitive patients and, especially in children and aged patients, allergies can arise owing to histamine liberation.\nMoreover, trimethaphan camsylate must not be used alone in the case of operations in the region of the gastrointestinal tract.\nThe dosage of SNP required for the controlled lowering of blood pressure is on average about 3 ug/kg"} +{"output_text": "-delayed communication mode that typically involves a one-to-one communication. Even though some software providers have offered solutions that allow a user to send one short message to multiple participants, such is not the same as real time voice communication between these same users.\nIn addition, the prior art has not provided a solution for a user to quickly and easily assemble a multi-user communication session that is hardware independent and, further, does not require the user to purchase additional hardware.\nIn addition,", "input_text": ", the most important enhancement in the cell phone, at least as it relates to interpersonal communication, has been the development of the capability of sending short text messages from one phone to another.\nOtherwise, the main improvements in communications have been largely concerned with connectivity. 
For example, communications protocols such as infrared and Bluetooth have become de facto requirements for all but the most inexpensive phones. In addition advances have been made in connectivity to the Internet (for example) and now it is routine for users to be able to access their e-mail and browse the web via their phones.\nHowever, these improvements in connectivity, as welcome as they might be, do not expand on the one-to-one personal communication aspect of the phone. One thing that would be a leap forward in such communications would be the ability to quickly and easily assemble a multi-user communication session that is hardware independent and, further, does not require the user to purchase additional hardware. Although the prior art has provided multi-user communications in the form of, for example, conference calls\u2014the present technology of conference calls is quite limiting to the user. For example, it is typically limited to a predetermined number of user connections (e.g., 5). Further, a start time must be communicated to each user so there is little opportunity for spontaneity. Further, adding more users to the session may be very difficult or impossible. Finally, the conference call will ultimately be limited to known users, i.e., those who are known to one of the participants and have been invited.\nAdditionally, exchanging short messages between users is a time-delayed communication mode that typically involves a one-to-one communication. Even though some software providers have offered solutions that allow a user to send one short message to multiple participants, such is not the same as real time voice communication between these same users. Of course, such group messaging is a time"} +{"output_text": " lamps are also known. 
However, these bulbs are also problematic in that they are not as efficient as fluorescent bulbs, and they also contain toxic materials.\nIn addition to the above-mentioned problems, fluorescent bulbs are also problematic in that they are not as efficient as incandescent bulbs. Fluorescent bulbs are also not as efficient as other types of lamps, such as halogen lamps, which are more efficient than fluorescent bulbs.\nIn addition to the above-mentioned problems, fluorescent bulbs are also problematic", "input_text": " burns, so the UV component of the light must be converted into visible light. The inside of a fluorescent tube is coated with a phosphorescent material, which when exposed to ultraviolet light glows in the visible spectrum. This is similar to many glow-in-the-dark toys and other devices that incorporate phosphorescent materials. As a result, the illumination from a fluorescent light will continue for a significant time, even after electrical power is discontinued, which for the purposes of the present disclosure will be understood to be the latent period or latency between the change in power status and response by the phosphor. As the efficiencies and brightness of the phosphors have improved, so in many instances have the delays in illumination and extinguishing, or latency, increased. Through the selection of ones of many different modern phosphorescent coatings at the time of manufacture, fluorescent bulbs may be manufactured that produce light from different parts of the spectrum, resulting in manufacturing control of the color temperature, or hue or warmness of a bulb.\nThe use of fluorescent bulbs, even though quite widespread, is controversial for several reasons. One source states that all pre-1979 light ballasts emit highly toxic Polychlorinated BiPhenyls (PCBs). Even if modern ballasts are used, fluorescent bulbs also contain a small but finite amount of mercury. Even very small amounts of mercury are sufficient to contaminate a property. 
Consequently, both the manufacture and disposal of mercury-containing fluorescent tubes is hazardous. Fluorescent lighting has also been alleged to cause chemical reactions in the brain and body that produce fatigue, depression, immuno-suppression, and reduced metabolism. Further, while the phosphor materials may be selected to provide hue or color control, this hue is fixed at the time of manufacture, and so is not easily changed to meet changing or differing needs for a given building space.\nOther gaseous discharge bulbs such as halide, mercury or sodium vapor"} +{"output_text": " epileptic focus, the signal received at that electrode would have a specific time delay. The signal received at a second electrode that is located at a different distance from the epileptic focus would have a different time delay. The signal received at a third electrode that is located at a third different distance from the epileptic focus would have a different time delay. The signal received at a fourth electrode that is located at a fourth different distance from the epileptic focus would have a different time delay. The signal received", "input_text": " epileptic seizures consistently originate from a single location within the brain. However, the system described herein is also adaptable for the treatment of a neurological event that involves a major portion or possibly all of the brain tissue.\nThe present invention also provides means for generating an ensemble of coordinated electrical stimuli designed to terminate the neurological event immediately upon (or even prior to) its onset. Thus, the present invention is a responsive detection and stimulation system for the early recognition and prompt treatment of a neurological event.\nThe present invention envisions a multiplicity of brain electrodes placed either within the brain, on the surface of the brain itself, or on the dura mater that surrounds the brain. 
Some one, several, or all of these brain electrodes can be used for detection of an abnormal neurological event such as an epileptic seizure. A responsive stimulation signal can also be applied to any one, several, or all elements of such an electrode array. The responsive stimulation signals sent to each electrode may be identical or they may be programmed to differ in amplitude, frequency, waveform, phase, and time duration. It is also envisioned that sensing electrodes may be entirely separate from the electrodes used for responsive stimulation.\nThe present invention envisions that a neurological event can be reliably detected in the presence of a normal EEG signal and in the presence of external noise by the use of modern and sophisticated signal processing techniques. Specifically, the electrical signal from an epileptic focus within a specific and limited spatial region within the brain can be reliably detected by combining the signals received at different electrodes that are placed at different distances from the epileptic focus. To improve signal-to-noise ratio, the signal received at a specified location that is at a specific distance from the epileptic focus could have a specific time delay to account for the propagation time it takes for the signal to reach that electrode. For example, if a first electrode is located directly over the site of the"} +{"output_text": " belt drive system having a damping mechanism which is effective to damp belt vibrations during high deceleration rates. What is also needed is an asymmetric damping tensioner belt drive system having a damping mechanism which is effective to damp belt vibrations during high deceleration rates without requiring a locking tensioner. What is also needed is an asymmetric damping tensioner belt drive system having a damping mechanism which is effective to damp belt vibrations during high deceleration rates without requiring a complex mechanical arrangement. 
What is also needed is an asymmetric damping", "input_text": " in the belt.\nRepresentative of the prior art is U.S. Pat. No. 5,439,420 to Meckstroth et al. which discloses an accessory drive system including a tensioner having a governor for controlling rotational motion of the arm with the arm being able to rotate freely in the direction in which tension of the belt is increased and with the governor resisting motion of the arm in the direction in which tension in the belt is decreased.\nThe prior art also teaches a method of arranging engine accessories so that the order of rotational inertial force is greatest for the accessory nearest the crankshaft pulley as seen from the tight side of the belt. This is taught in U.S. Pat. No. 4,959,042 to Tanaka. This method does not rely on the operational characteristics of the tensioner, instead relying on the dynamics of the staggered order of the accessories based upon rotational inertia.\nThe prior art systems depend upon a locking tensioner or upon a particular mechanical arrangement to address the problem of high rate of change of engine speed. Neither system solves the dual problems of preventing squeal during speed changes while continuing to damp belt vibrations. Further, the prior art systems, in the case of Meckstroth, are complex and expensive, requiring complex mechanical devices to control the movement of a tensioner arm. The prior art systems are relatively large requiring room on the engine surface. The Tanaka method does not fully address the issue of high deceleration rates, relying instead on the arrangement of the components which does not fully defeat the tightening of the belt during deceleration.\nReference is also made to co-pending U.S. patent application Ser. No. 
09/861,338 filed May 18, 2001 which discloses a tensioner having a damping mechanism.\nWhat is needed is an asymmetric damping tensioner belt drive system having an asymmetric damping tensioner"} +{"output_text": " article of the invention, the water soluble resin is preferably a water soluble resin having a glass transition temperature of 50xc2x0 C. or more. When the water soluble resin has a glass transition temperature of 50xc2x0 C. or more, the water soluble resin is not decomposed by the drying drum and the water soluble resin is not decomposed by the water-decomposable non-woven fabric. Therefore, the water soluble resin is not decomposed by the water-decomposable", "input_text": " the cleaning article in a dry state as measured according to a KES bending test is from 0.05 or more to 1.0 or less. In the invention, used is the bulky non-woven fabric of a low density and therefore, the rigidity is not excessive and the softness is excellent. In addition, even for such bulky non-woven fabric of a low density, because the solution of the water soluble resin at a high viscosity is coated on the surface of the non-woven fabric thereby forming the water soluble resin-containing surface layer, the rigidity (B value) of 0.05 or more as described above can be attained.\nWhen the cleaning article of the invention is prepared for use in a wet (moistened) state, an insolubilizing agent for the water soluble resin is preferably added. This can maintain the wet (moistened) strength of the cleaning article at a high level. However, the cleaning article of the invention may be used in a dry state as it is.\nIn such a wet state, the cleaning article of the invention preferably has such a softness that the B value (which indicates the bending rigidity) of the cleaning article in a wet state as measured according to a KES bending test is 0.03 or more. 
In this case, the upper limit is preferably 0.1 or less.\nFurther, when the water soluble resin is coated only on one side, it is preferred that the water soluble resin is coated on a surface of the water-decomposable non-woven fabric to be contacted by a drying drum for drying the water-decomposable non-woven fabric in a manufacturing process thereof. Because the surface becomes relatively smooth after in contact with the drying drum, the solution of the water soluble resin, when coated, less intrudes into the non-woven fabric.\nIn the cleaning"} +{"output_text": " the information concerning the presence of the data is registered in the management table, the device on the network to transmit the data corresponding to the management information.\nIn this data transferring and receiving method, the management table on the network is shared among the apparatuses connected to the network.\nIn the data transferring and receiving method, an input operation may be performed by a single input operation unit or a number of input operation units, each having an ID number. Upon performing an input operation by the input operation unit", "input_text": " network and storing it.\nIn this data receiving method, there is provided a management table on the network in which the management information representing the information concerning the presence of the data selected by the input operation unit is registered. The management table on the network is shared among the apparatuses connected to the network.\nIn the data receiving method, an input operation may be performed by a ingle input operation unit or a number of input operation units, each having an ID number. 
Upon performing an input operation by the input operation unit, the ID number of the operated input operation unit may be identified, and the management information registered in the management table in correspondence with the identified ID number may be checked, and a request may be provided to the device on the network indicated by the management information to transmit the data corresponding to the management information.\nIn this data receiving method, the management table in which the management information representing the information concerning the presence of the data selected by the input operation unit is registered is provided on the network according to the ID number of the input operation unit.\nAccording to a further embodiment of the present invention, there is provided a data transferring and receiving method including the steps of checking for, upon performing an input operation by an input operation unit, management information representing information concerning the presence of data registered in a management table provided on a network, registering in the management table, in a case where there is no management information registered in the checked management table and where the information concerning the presence of the displayed data is selected by the input operation unit, the management information representing the information concerning the presence of the selected data, and transferring, upon obtaining a data transmission request that has been provided by the input operation unit from a device on the network, data corresponding to the information concerning the presence of the data indicated by the registered management information to the device on the network, and requesting, in a case where the management information representing"} +{"output_text": "((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)propanamide; [13] 
2-(4-Hydroxy-3-methoxyphenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)propanamide; [14] 2-(3,5-Difluorop", "input_text": "6-(trifluoromethyl)pyridin-3-yl)methyl)acetamide; [5] 2-(2,4-Difluorophenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)acetamide; [6] 2-(2,6-Difluorophenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-)methyl)acetamide; [7] 2-(2,5-Difluorophenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)acetamide; [8] 2-(4-Fluorophenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)acetamide; [9] 2-(4-Hydroxy-3-methoxyphenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)propanamide; [10] 2-(3,5-Difluorophenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)propanamide; [11] 2-(3,4-Difluorophenyl)-N-((2-(4-methylpiperidin-1-yl)-6-(trifluoromethyl)pyridin-3-yl)methyl)propanamide; [12] 2-(4-Fluorophenyl)-N-"} +{"output_text": " be used as a speaker.\nHowever, the electromagnetic actuator has a disadvantage in that it is difficult to obtain a high-frequency response. This is because the voice coil is a conductor and the magnetic circuit is a magnetic circuit that uses the magnet.\nIn order to solve this problem, a method of using a piezoelectric actuator has been proposed. The piezoelectric actuator is a device that uses a piezoelectric material such as PZT (lead zirconate titanate) or PZT-based", "input_text": " capacitance in an equivalent circuit. As a result of the step-shaped indentation of the ridge according to the invention, the ridge size is reduced. This causes a lowering of the electric field energy relative to a full size ridge. 
The capacitance therefore has a negative value.\nAccording to a preferred embodiment of the invention, the devices for exciting the emission of the waveguide wave are formed by a pin made of a conductive material arranged laterally next to the slot.\nWith respect to the longitudinal direction of the waveguide, the pin is preferably arranged at the level of the center of the slot.\nIn a preferred embodiment, the step-shaped indentation has a rectangular cross-section of a length D and a height H parallel and/or perpendicular to the longitudinal direction of the ridge. The length D and the height H of the step-shaped indentation are preferably selected such that C1+C2=0 applies, wherein C1 indicates the coupling devices to be represented as the capacitance C1 in an equivalent circuit, and C2 indicates the step-shaped indentation in the ridge to be represented as a negative capacitance C2 in an equivalent circuit.\nThe step-shaped indentation and the slot are preferably arranged with respect to the longitudinal direction of the waveguide in the center relative to one another.\nOther objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings. Electromagnetic actuators have been generally utilized as driver components for acoustic elements such as speakers, due to their easy handling. An electromagnetic actuator comprises a permanent magnet, a voice coil, and a diaphragm, and causes a low-stiffness diaphragm that is made of an organic film and is fixed to the coil to vibrate, through the operation of a magnetic circuit in a stator which uses the magnet. Therefore, they present a reciprocal vibration mode and can"} +{"output_text": ".\nFIG. 2 is a circuit diagram of a conventional resolver. Referring to FIG. 
2, the conventional resolver includes a first inverter IV1, a second inverter IV2, a third inverter IV3, a fourth inverter IV4, a fifth inverter IV5, a sixth inverter IV6, a seventh inverter IV7, an eighth inverter IV8, a ninth inverter IV9, a tenth inverter IV10, a first N", "input_text": " back with the output signals from the first and fourth NAND gates ND1 and ND4, respectively. Then, the second and third NAND gates ND2 and ND3 perform a NAND gate operation and supply the first output signals \u2018Y1\u2019 and \u2018Y0\u2019, respectively.\nDuring the operation, when the internal clock signal \u2018CLK\u2019 is at a low level, the resolver 10 supplies the deactivated first output signals \u2018Y0\u2019 and \u2018Y1\u2019 at a high level. Then, if the internal clock signal \u2018CLK\u2019 is changed to a high level, then the resolver 10 supplies the first output signals \u2018Y0\u2019 and \u2018Y1\u2019 according to the input signal \u2018D\u2019. For example, if the internal clock signal \u2018CLK\u2019 is at a high level and the input signal \u2018D\u2019 is at a high level, then the first positive output signal \u2018Y0\u2019 transitions to a low level opposite to the input signal D, and the first negative output signal \u2018Y1\u2019 transitions to a high level opposite to the first positive output signal \u2018Y0\u2019. Meanwhile, if the internal clock signal \u2018CLK\u2019 is at a high level and the input signal \u2018D\u2019 is at a low level, then the first positive output signal \u2018Y0\u2019 transitions to a high level opposite to the input signal \u2018D\u2019, and the first negative output signal \u2018Y1\u2019 transitions to a low level opposite to the first positive output signal Y0.\nHowever, for the resolver 10, the three-input NAND gates ND2 and ND3, the two-input NAND gates ND1 and ND4, and the inverter IV1 are needed. Specifically, since 11 PMOS transistors and 11 NMOS transistors are needed, the layout efficiency is poor. 
In addition, the response speed may be lowered due to the NAND gate operations with the feedback inputs"} +{"output_text": " of an image area on the basis of image data of a document and performing image processing based on a result of the identification. In this image processing apparatus, a document image is read by a scanner, and image data of the document image is converted into a binary image. Then, a predetermined area of the binary image is extracted, and the extracted area is binarized. The binarized area is compared with a predetermined area of a reference image, and the type of the image area is identified", "input_text": ". H. Bickel, R. Guthrie and G Hammersen), pp 259-270, Springer Verlag, Berlin 1980}. The dried blood spots are of great utilty because they facilitate the ability to ship, archive and perform multiple analyses on the same sample. More recently, the utility of such dried blood spots has been extended to tests involving DNA amplification and analysis (McCabe ERB. 1991. Utility of PCR for DNA Analysis from Dried Blood Spots on Filter Paper Blotters, in PCR Methods and Applications, Volume 1: pp 99-106). Application of the technique is limited, however, and has only been applied to analysis of blood samples. 1. Field of the Invention\nThe present invention relates to a document reader, an image forming apparatus, and an image processing method which are preferable applied to a scanner, a color digital copying machine, or a complex machine having an image area identification and adjustment function of identifying an image area and adjusting image data.\n2. 
Description of the Prior Arts\nRecently, there has been used a color image forming apparatus for forming a color image based on image data related to red (R), green (G), and blue (B) colors obtained from a colored document image or for forming a color image based on image data received from a printer controller (an external device) such as a server or a personal computer (hereinafter, referred to simply as PC). To form an optimum color image by means of this type of color image forming apparatus, it is necessary to identify a photographic image, a screened halftone image, or a character area on the basis of image data of a document and to perform image processing based on a result of the identification before the image formation.\nJapanese Unexamined Patent Publication (Kokai) No. Hei09-172544 (1997) discloses an image processing apparatus for identifying a type"} +{"output_text": " layers 102, and the solder balls 108 are connected to a mother board (not shown) through a solder resist film 107.\nIn the semiconductor device 120, the signal wiring layers 102 are connected to the ground plane 104 through the bump connection portions, and the ground plane 104 is connected to the mother board through the solder resist film 107. In this manner, the semiconductor device 120 is constructed.\nIn the semiconductor device 120, the signal wiring layers 102 are connected to the ground plane 104 through the", "input_text": " least a portion of the predetermined pattern in the green-sheet layers. The green-sheet layers are sintered together at a predetermined temperature for a predetermined amount of time to form a substantially monolithic structure having a micro-gas chromatograph column defined therein, with a porous plug, formed from the thick-film paste, disposed in the micro-gas chromatograph column. 1. 
Field of the Invention\nThe present invention relates to a semiconductor package and a semiconductor device and, more particularly, a semiconductor package of FBGA (Fine Pitch Ball Grid Array) type, or the like employed in the high frequency application, and a semiconductor device in which a semiconductor chip is packaged in the semiconductor package.\n2. Description of the Related Art\nIn recent years, in the high-frequency application semiconductor device employed in the telecommunication apparatus, etc., the signal speed is being increased highly, and such higher speed of the signal is restricted by the disturbance in the signal waveform. For this reason, the semiconductor device that can suppress the disturbance in the signal waveform even when the higher speed signal is applied is desired. Such semiconductor device has the FBGA type structure using two metal wiring substrates, for example. FIG. 1 is a sectional view showing a semiconductor device having the FBGA type structure in the related art, and FIG. 2 is partial plan view viewed from an A portion in FIG. 1.\nAs shown in FIG. 1, in a semiconductor device 120 of the FBGA package type in the related art, signal wiring layers 102 are formed on one surface of an insulating film 100, and a ground plane 104 is formed on the other surface to spread over the entire surface. The signal wiring layers 102 are covered with a solder resist film 106 except their bump connection portions. In this manner, a wiring substrate 105 is basically constructed. Then, solder balls 108 are mounted on the bump connection portions of the signal wiring"} +{"output_text": " points of E, the set is called a group. Moreover, the set of the points of E is called a group E.\nThe discrete logarithm problem on an elliptic curve is a problem to find a point P on E such that P=a\u00d7O+b, where a and b are elements of GF (p) and a\u00d7O+b is an element of E. 
Here, the point P is called a discrete logarithm base point.\nThe discrete logarithm problem on an elliptic curve", "input_text": " of the Prior Art\n1. Public-Key Encryption\nRecently, data communication based on computer technology and communication technology has become widely available, and in this data communication, a secret communication mode or a digital signature mode is used. Here, the secret communication mode is a mode to communicate without leaking communication contents to a person other than the other specified party of the communication. Moreover, the digital signature mode is a mode that shows the correctness of communication contents to the other party of the communication and certifies the identity of the originator.\nIn the secret communication mode or digital signature mode, an encryption mode called a public-key encryption is used. The public-key encryption is a mode to easily manage encryption keys that are different to each of the other parties of communication when the other parties of communication are many, to be an indispensable fundamental technology to communicate with the many other parties of communication. In the secret communication using the public-key encryption, an encryption key and a decryption key are different, and the decryption key is secret while the encryption key is public.\nAs a base of security of this public-key encryption, a discrete logarithm problem is used. As for the discrete logarithm problem, there are what is defined on a finite field and what is defined on an elliptic curve as representatives. Moreover, the discrete logarithm problem is described in detail in \u201cA Course in Number theory and Cryptography\u201d by Neal Koblitz, Springer-Verlag, 1987.\n2. The Discrete Logarithm Problem on an Elliptic Curve\nThe discrete logarithm problem on an elliptic curve is described below. Here, p is a prime number and an elliptic curve defined on a finite field GF (p) is E. 
When we think a set that is obtained by adding a formal point O to the whole points both of x coordinates and y coordinates of which belong to GF (p) among the"} +{"output_text": "elles, allowing the passage of molecules into the cells. Electroporation is a non-invasive technique that can be used to deliver molecules into cells. Electroporation is typically used to deliver molecules into cells that are difficult to access, such as cells in the brain, or to deliver molecules into cells that are not normally permeable to molecules.\nElectroporation is typically used to deliver molecules into cells that are difficult to access, such as cells in the brain, or to deliver molecules into cells", "input_text": "S. patent application Ser. No. 11/055,930 (now U.S. Pat. No. 7,850,645). These applications describe processes for delivering a therapeutic agent to a location within a blood vessel. The disclosures of these applications are expressly incorporated herein by reference.\nThe '603 application describes devices for delivery of therapeutic agents to a diseased location based on electric field effects (i.e., delivery is electrically assisted), such as iontophoresis, electroporation, or both. The '603 application generally relates to internal drug delivery devices which contain a source of therapeutic agents, electrodes and power sources for applying voltages across the first and second electrodes. The power sources may be adapted, for example, to promote electrically assisted therapeutic agent delivery within a subject, including electroporation and/or iontophoresis. The therapeutic agent sources are polymeric regions that contain one or more types of electrically conductive polymers and one or more types of charged therapeutic agents or are polymeric regions that contain one or more types of ion-conductive polymers and one or more types of charged therapeutic agents. 
By placing the therapeutic agent within a polymer region, movement of the therapeutic agent is restricted and thus more precise local dosing of the therapeutic agent is possible. This design is also advantageous in that it allows one to provide different therapeutic agents or different therapeutic agent dosages for different sections of the device, which can be beneficial in various instances (e.g., where vulnerable plaque is located on one side of a vessel).\nIontophoresis is an electrochemical process by which an electric field is used as a driving force to move a drug into a subject. This technique typically requires two or more electrodes for creating an electric field as well as a drug that carries a net electrical charge at the local physiological pH.\nElectroporation methods use short, high-voltage pulses to create transient pores in the cell membranes or in organ"} +{"output_text": ",8a-hexahydro-3-methyl-5-oxo-2-phenyl-2H-pyran-2-ylmethyl)heptanoic acid. It was described first time in U.S. Pat. No. 4,346,227.\nPitavastatin is a lactone form of mevastatin. They were described first time in European patent no. 304063 and U.S. Pat. No. 5,011,", "input_text": "))-1,2,3,7,8,8a-hexahydro-3,7-dimethyl-8-(2-(tetrahydro-4-hydroxy-6-oxo-2H-pyran-2-yl)ethyl)-1-naphthalenyl-2,2-dimethylbutanoate. It was described first time in U.S. Pat. No. 4,444,784.\nLovastatin is chemically (1S-(1alpha, 3alpha, 7beta, 8beta (2S*, 4S*) 8a beta))-1,2,3,7,8,8a-hexahydro-3,7-dimethyl-8-(2-(tetrahydro-4-hydroxy-6-oxo-2H-pyran-2-yl)ethyl)-1-naphthalenyl-2-methylbutanoate. It was described first time in U.S. Pat. No. 4,231,938 and JP 8425599.\nItavastatin is chemically (S\u2014(R*,S*-(E)))-7-(2-cyclopropyl-4-(4-fluorophenyl)-3-quinolynyl)-3,5-dihydroxy-6-heptenoic acid. Pitavastatin is a lactone form of itavastatin. They were described first time in European patent no. 304063 and U.S. Pat. No. 
5,011,930, respectively.\nMevastatin is chemically (3R,5R)-3,5-dihydroxy-7-((1S,2S,6S,8S,8aR)-2-methyl-8-((2S)-2-methylbutanoyl)oxy)-1,2,6,7,8"} +{"output_text": " a film in a direction transverse to the direction of corrugation. The ring-rolling process is carried out by passing the film through a nip formed between a pair of corrugating rolls, which are rotated in opposite directions. The corrugating rolls are provided with a plurality of corrugations, which are arranged in a pattern that is complementary to the pattern of corrugations in the film. The corrugating rolls are rotated at a speed that is sufficient to stretch the film in", "input_text": " in the art are methods for imparting extensibility to an otherwise substantially inelastic material, which may be employed as a backsheet. For example, the use of corrugating rolls to laterally or longitudinally stretch and to simultaneously provide a corrugated form to thin plastic films is disclosed in U.S. Pat. No. 4,116,892, entitled \u201cProcess for Stretching Incremental Portions of an Orientable Thermoplastic Substrate and Product Thereof,\u201d which issued on Sep. 26, 1978, to Eckhard C. A. Schwarz; U.S. Pat. No. 4,834,741, entitled \u201cDiaper With Waistband Elastic,\u201d which issued on May 30, 1989, to Reinhardt N. Sabee; U.S. Pat. No. 5,156,793, entitled \u201cMethod for Incrementally Stretching Zero Strain Stretch Laminate Sheet In A Non-Uniform manner To Impart A Varying Degree Of Elasticity Thereto,\u201d which issued on Oct. 20, 1992, to Kenneth B. Buell et al.; U.S. Pat. No. 5,167,897, entitled \u201cMethod for Incrementally Stretching A Zero Strain Stretch Laminate Sheet To Impart Elasticity Thereto,\u201d which issued on Dec. 1, 1992 to Gerald M. Webber et al.; and U.S. Pat. No. 5,422,172, entitled \u201cElastic Laminated Sheet of An Incrementally Stretched Nonwoven Fibrous Sheet and Elastomeric Film and Method,\u201d which issued on Jun. 6, 1995, to Pai-Chuan Wu. 
The corrugating rolls disclosed in each of those patents are employed in carrying out a process sometimes referred to as \u201cring-rolling,\u201d to locally stretch"} +{"output_text": "sinusoidal character of the alternating voltage, the motor torque is not constant, and the motor speed is not constant.\nThe above-mentioned disadvantages of the prior art are overcome by the present invention, and an object of the present invention is to provide a variable frequency and amplitude three-phase inverter which is capable of supplying a three-phase motor with a variable frequency and amplitude three-phase output signal, and which is capable of supplying a three-phase motor with a variable frequency and amplitude", "input_text": " to supply DC machines. The output voltage of the thyristor rectifiers increases at the moment of firing. As a result there is a pulsating alternating voltage superimposed on the direct voltage.\nThis alternating component of the supply voltage increases considerably the load of commutator DC machines. Thus, they must be made more robust (i.e. overdimensioned) in order to prevent damage due to these effects.\nThe above-mentioned advantages of AC motors cannot be utilized when continuous speed and power regulation is necessary.\nVariable three-phase output voltage and frequency from static frequency changers are already being used to drive electrical AC motors.\nA similar approach is described by Sandor Marti and Dr. Laszlo Nagy in \"The Optimization of I.F. Thyristor Frequency Changers to Supply Grinding-Wheel Motors\" (Villamossaq Vol. 21, No. 7, pp. 216-218, July 1973). According to this article, the variable frequency and amplitude of the three-phase output signal is produced in two stages. First, a variable direct voltage is generated by a rectifier controlled by the three-phase network, and then this direct voltage is converted into alternating voltage of the required frequency by using a three-phase inverter. 
The amplitude of the output voltage is controlled by the direct voltage fed to the inverter, and the frequency is controlled by changing the recurrence rate of the firing pulses supplied to its thyristors.\nThis approach is basically similar to the inverter of Brown Boveri and CIE (BBC) named VERITRON. Its description can be found in their catalogue D GHS 309320.\nOwing to the non-sinusoidal character of the alternating voltage produced by said inverters, the pace of electric motors driven thereby is rough, especially at low frequencies. As a result of the non-"} +{"output_text": ". However, this method requires a large number of slots, increasing the number of turns of the windings, and thus increasing the number of turns of the coil ends. As a result, the coil ends are increased in height, increasing the amount of solder or weld melt, and decreasing the quality of the finished product.\nIn addition, in the case of a three-phase motor, the number of turns of the coil ends is increased by a factor of three, and the number of turns", "input_text": " the end portions 54b, which have lost their insulation due to soldering or welding, the coil-end construction is easily corroded by exposure to moisture, making corrosion resistance extremely low.\nFurthermore, because the front-end coil end group is composed of two rows of ninety-six joint portions, i.e., 192 joint portions, the construction facilitates short-circuiting between the joint portions, increasing the likelihood of short-circuiting accidents.\nA large number of the short conductor segments 54 must be inserted into the stator core 51 and their end portions 54b must be joined by welding, soldering, etc., significantly reducing operability. Furthermore, the amount of each conductor segment 54 which is inserted into the slots 51a must be greater than the length of the stator core 51, facilitating damage to the insulation and reducing the quality of the finished product. 
In addition, when joining the end portions 54b, short-circuiting often occurs between the joint portions due to spilt solder or weld melt, significantly decreasing mass-producibility.\nThe end portions 54b of the conductor segments 54 are joined to each other by clamping a portion thereof in a jig, and soldering or welding the tips thereof. Thus, because clamping area is required for the jig and expansion of the soldered portions or welded portions occurs, the height of the coil ends is increased and space between the joint portions is reduced. As a result, coil leakage reactance in the coil end portions is increased, causing output to deteriorate, and wind resistance is increased, exacerbating wind noise.\nFurthermore, as a measure against magnetic noise, mutual cancellation of magnetic pulsation forces by winding two sets of three-phase windings into slots in positions offset by an electrical phase difference of 30xc2x0 has been proposed in Japanese Patent Laid-Open No. HEI 4-26345, for example"} +{"output_text": ".\nThe present invention provides a method for planarizing a surface of a substrate, such as a circuit board, which is not susceptible to the above-described problems associated with conventional chemical-mechanical polishing techniques. The present invention also provides a method for planarizing a surface of a substrate, such as a circuit board, which is compatible with organic-based substrates.\nThe present invention provides a method for planarizing a surface of a substrate, such as a circuit board, which is not susceptible", "input_text": "ist pattern, which typically is of photographic sharpness. Pattern plating thereby provides good control over circuit path width and permits conductive lines of relatively fine width. The circuit path height, however, is not as easily controlled because such height is dependent on the density of the desired conductive lines. 
As a result, isolated conductive lines are typically thicker than densely packed (closely spaced) conductive lines. Thus, line height is not precisely controlled by the acid plate process.\nThe additive (electro-less) plating process is similar to the acid plate pattern process, except that chemical plating processes are used rather than electro-plating processes. Additive plate fabrication generally requires more time to complete as compared to acid plate pattern fabrication but is not as susceptible to circuit path height variation according to line density. Additive plating does occasionally result in copper nodule formation, however.\nThe surfaces of pattern plated circuit boards need to be planarized prior to successfuil plating. Planarization methods such as surface machining remove non-planar regions of the board. Chemical mechanical polish, another often used method also employed in the semiconductor and ceramic industries, contains abrasive slurry materials which attack both resist and copper surfaces. Such polishing techniques are not compatible with many organic-based substrates, which are often used in conjunction with surface-mount technology circuit boards. Surface-mount technology is gaining in popularity because it permits higher component densities and faster component mounting as compared with more conventional wire-bonding techniques in which it is necessary to electrically interconnect several small contacts and conductor sites with fine, delicate wires. Such polishing techniques are generally incompatible with organic based substrates because such substrates are somewhat flexible and typically have surface undulations. The surface undulations are due to variations in substrate thickness and also to the inherent flexibility of the boards, which permits bowing and warping. Conventional chemical-mechanical polishing techniques will not follow these undulations and contours of flexible substrates"} +{"output_text": " Pat. No. 
5,556,783) In contrast, androgenic alopecia (common baldness in women) is a pathologic condition that is associated with the loss of androgenic hormones. (U.S. Pat. No. 5,556,783) Androgenic alopecia is a progressive condition that is characterized by a thinning of the scalp hair and by a reduction in the diameter of the hair shaft. (U.S. Pat. No. 5,", "input_text": " of a new hair cycle.\nThe hair follicle is an epidermal appendage, the lower part of which undergoes cycles of growth and degeneration. (U.S. Pat. No. 5,556,783, incorporated by reference herein in its entirety) During the anagen (the growing phase) of the hair cycle, matrix keratinocytes located in the bulb region grow vigorously, generating cells that differentiate into several distinct hair components including the medulla, cortex and inner root sheath. During catagen, keratinocytes of the lower follicle below the bulge region (the attachment site of the arrector pili muscle) degenerate and the dermal papilla cells (DP; a group of specialized mesenchymal cells) aggregate and become encapsulated by a connective tissue sheath. Through the contraction of this sheath, the DP aggregate ascends and becomes attached to the bottom of the upper (permanent) portion of the follicle (telogen or the resting phase). Finally, a new epithelial growth originates from the bottom of the bulge area; this downgrowth pushes the DP away and reforms a growing bulb.\nThe in vitro growth potential of different subpopulations of follicular epithelial cells have been studied. (U.S. Pat. No. 5,556,783) Keratinocytes of different portions of human scalp follicles were isolated by microdissection followed by trypsinization and propagated in the presence of 3T3 feeder cells. 
The results indicate that the upper follicle contains keratinocytes that have in vitro proliferative potential that is significantly higher than those of the lower follicle, the bulb, the sebaceous gland and the epidermis.\nAlopecia (hair loss) is a common condition that results from diverse causes. For example, adrenergic alopecia (common baldness) is seen in the vast majority of adult males and is considered physiologic and part of the aging process. (U.S."} +{"output_text": "S. Pat. Nos. 4,976,148; 5,964,733; 6,039,733; 6,042,582; 6,050,976; 6,056,975; 6,073,913; 6,083,913; 6,090,976; 6,099,484; 6,099,485; 6,099,486; 6,099,487; 6,099,488;", "input_text": " indicated. Three representative thermal blankets known in the prior art are shown in FIGS. 1A-1D. A \u201cfull body\u201d thermal blanket 10 is shown in FIG. 1A. The full body thermal blanket is adapted to lie upon a person and to extend longitudinally along the body of the person in order to cover substantially the person's entire body, from near the ankles or feet up to the neck. A \u201clower body\u201d thermal blanket 12 is shown in FIG. 1B. The lower body thermal blanket 12 is adapted to lie upon the person and to extend longitudinally along the body of a person in order to cover the person's lower body, from near the ankles or feet up to the waist or pelvis of the person. An \u201cupper body\u201d thermal blanket 15 is illustrated in FIGS. 1C and 1D. The upper body thermal blanket 15 has a bow-tie shape that is adapted to lie upon and extend transversely across the upper body of a person in order to cover the person's chest and extended arms. A head drape 16 may be formed on or attached to the upper body thermal blanket 15 for draping over the head 17 of a person in order to retain warmed air expelled through the blanket 15 about the head to aid in therapeutic warming during surgery. 
When fed a stream of warmed pressurized air, each of the thermal blankets 10, 12, 15 inflates and distributes the air within itself. While the thermal blanket lies on the person, the warmed pressurized air flows through apertures or interstices in a permeable surface of the thermal blanket which faces the person. These thermal blankets may have one, two, or more inlet ports 18 through which an air hose 19 provides warmed pressurized air from a heater/blower unit (not shown in these drawings).\nThe construction of thermal blankets is well understood. Examples of specific constructions are given in U."} +{"output_text": "elled member. Therefore, the position of the self-propelled member cannot be detected accurately.\nIn order to solve the above-mentioned problems, a position sensor shown in FIG. 2 has been proposed. In the position sensor shown in FIG. 2, a plurality of position sensing lines 2a and 2b are arranged in a traveling field 1. The position sensing lines 2a and 2b are connected to a position detector 4. The position detector 4 detects the position of a self-prop", "input_text": " in FIG. 1. In the position sensor shown in FIG. 1, X-position sensing lines 2a and Y-position sensing lines 2b are densely provided within the traveling field 1. The X-position sensing lines 2a are connected to an X-position retriever 3a, and the Y-position sensing lines 2b are connected to a Y-position retriever 3b. In this way, a self-propelled member travels over the traveling field 1 within which the X-position sensing lines 2a and the Y-position sensing lines 2b are arranged. The self-propelled member emits a unique signal from its transmitter. The position sensing lines 2a and 2b receive the unique signal and send the thus-received signal to the X-axis and Y-position retrievers 3a and 3b. 
The received signal is further transmitted to a position detector 4, where the X-coordinate position and Y-coordinate position of the self-propelled member are detected by the position detector 4. The position detecting signal is transmitted to a microcomputer 5. Since the self-propelled member emits a unique signal at predetermined time intervals, the traveling position of the self-propelled member is detected every time the unique signal is emitted.\nIn the case of a game machine which senses the position of a self-propelled member through use of the position sensing lines 2a and 2b, since the position sensing lines 2a and 2b are arranged within the traveling field 1 densely, manufacturing costs of the game machine are expensive. Laborious operations are required for laying sensing lines within a traveling field. Further, there may arise a case where malfunction may arise for reasons of an open circuit or connection failures. In this case, a plurality of position sensing lines located in the vicinity of one self-propelled member receive signals output from the self-prop"} +{"output_text": " research that the magnetic mirror was the best way to confine a plasma. However, the magnetic mirror has a disadvantage in that it is difficult to confine a plasma in a toroidal configuration.\nThe toroidal type of confinement device has a toroidal magnetic field which is closed upon itself. The plasma is confined by the magnetic field lines which are closed upon themselves. The plasma is confined by the magnetic field lines which are closed upon themselves. The plasma is confined by the magnetic field lines which are", "input_text": " of the cells are programmed, the node N3 is set to \u201c1\u201d during phase F5. This concludes the procedure for the first page of the memory cell array.\nDuring the program procedure of the first page of the memory cell array, the data of the second page are simultaneously loaded to the node N4 in the LATCH 1. 
As a result, two procedures are carried out concurrently in a given page buffer.\nU.S. Pat. No. 6,031,760 entitled SEMICONDUCTOR MEMORY DEVICE AND METHOD OF PROGRAMMING THE SAME describes in connection with FIG. 5 thereof a prior art single-latch memory device that is typical of conventional memory devices. The described circuit has a single sense amplifier S/A that includes only a single latch circuit LT. This invention relates to the confinement of plasmas by magnetic fields and, more particularly, to an apparatus and method for the formation of a spheromak plasma.\nDevices employed for the containment of plasmas by magnetic fields may have various configurations. Two well-known types of such devices are the open-ended type, such as the magnetic mirror type, and the toroidal type, such as the tokamak. The underlying principle of all types of such containment devices is the containing of a hot, dense gas away from physical walls for a time sufficient to allow fusion reactions to take place.\nAn advantage of the mirror-type device is that it has a coil-blanket topology which does not link the plasma. However, the mirror-type open ended apparatus has a disadvantage in that the trapped charged particles may escape while travelling along the magnetic field lines which define their spiral orbits. The magnetic field lines do not close upon themselves inside the magnetic mirror, thus compounding the problem of large plasma losses through the mirror ends. It occurred to many people in the early days of fusion"} +{"output_text": " issued on Aug. 11, 1998 to Edward J. Cone, et al., describes a system for managing an attention brokerage account. The system includes a database for storing information about the brokerage account, a processor for processing information in the database, and a display for displaying information in the database. 
The processor is programmed to receive information from the brokerage account, to process the information, and to display the processed information on the display. The processor is also programmed to receive information from the broker", "input_text": " to Netcentives, Inc. on Jun. 30, 1998. The '870 patent provides a system whereby the user can make purchase of products over the Internet and receive award points, which are stored in an associated database. The user can subsequently view an award catalog to determine which awards he may be able to redeem based on the number of points in his account. This patent does not teach, however, the ability of a user to trade-in his points accumulated in a pre-existing frequent flyer account in order to make purchases of products from the award catalog or allow the points to be pooled with other programs in order to gain further purchasing power.\nThe ClickRewards program site appears to operate in the same fashion as that described in the '870 patent; i.e. it allows users to gain points (called \u201cClickMiles\u201d) for making an online purchase of a product through an associated web site. For example, ClickMiles may be awarded for a purchase of Gap products at the Gap web site. The ClickMiles can ultimately be redeemed for frequent flyer miles, for example at one of several major airlines. Another web site, www.webflyer.com, is associated with ClickRewards and provides ClickMiles for purchasing frequent flyer-related goods, such as guidebooks.\nThe ClickMiles Reward Catalog allows the user to redeem the ClickMiles for merchandise in the alternative to frequent flyer miles. For example, a CD can be obtained from CDNow by redeeming 900 ClickMiles.\nAlthough the ClickRewards program allows a user to redeem accumulated points for obtaining merchandise over the Internet, it does not allow for the redemption of frequent flyer miles from a pre-existing account to be traded for reward points.\nU.S. Pat. No. 
5,794,210, ATTENTION BROKERAGE,"} +{"output_text": " may be fitted with a retainer.\nTypically, brackets and retainers are bonded to a patient's teeth in the dental office using a dental adhesive. The adhesive is typically placed on the surface of the tooth and the bracket. Once the adhesive has set, the bracket is placed on the tooth and the adhesive is allowed to set. The adhesive is designed to hold the bracket in place on the tooth.\nThe process of bonding brackets to a patient's teeth is called \u201cbonding.\u201d The", "input_text": " provide a foot and toe protection device for an open toe style shoes and sandals that is secured on the foot within the shoe such that the foot and toe protection device is adjacent to the wearer's foot.\nStill a further objective of the present invention is to provide a foot and toe protection device operable to be secured on the foot within an open toe style shoes and sandals that is generally transparent.\nYet another objective of the present invention is to provide a foot and toe protection device that can be worn numerous times before becoming worn out and therefore, does not require consistent replacement.\nAn additional objective of the present invention is to provide a foot and toe protection device for an open toe style shoes and sandals that is adjacent to the pinky toe of a wearer wherein the foot and toe protection device functions to prevent the pinky toe from protruding and bulging from the shoe.\nA further objective of the present invention is to provide a foot and toe protection device for an open toe style shoes and sandals that is manufactured from a flexible material.\nTo the accomplishment of the above and related objectives the present invention may be embodied in the forms illustrated in the accompanying drawings. Attention is called to the fact that the drawings are illustrative only. Variations are contemplated as being a part of the present invention, limited only by the scope of the claims. 
Orthodontics is the practice of manipulating a patient's teeth to provide better function and appearance. In general, brackets are bonded to a patient's teeth and coupled together with an arched wire. The combination of the brackets and wire provide a force on the teeth causing them to move. Once the teeth have moved to a desired location and are held in a place for a certain period of time, the body adapts bone and tissue to maintain the teeth in the desired location. To further assist in retaining the teeth in the desired location, a patient"} +{"output_text": " of the time.\nThe present invention is directed to a method and apparatus for providing a user interface for a computer system. More particularly, the present invention is directed to a method and apparatus for providing a user interface for a computer system that is capable of displaying a plurality of windows on a display screen.\n2. Description of the Related Art\nA computer system typically includes a display screen for displaying information to a user. The display screen is typically a cathode ray tube (CRT) or a liquid", "input_text": "-select signal and an overlay signal, which together cause the channel selector 20 to recover both the encoded video signal of the selected channel and the encoded video signal containing the overlay frame or frames. The overlay signal causes the video decoder 22 to decode the recovered channel and overlay video signals from the channel selector 20 into respective sequences of frames, and causes the combiner 24 to blend the overlay frames with the channel frames to generate blended frames. The optional re-encoder 26 re-encodes these blended frames and provides them to the display 13, which decodes the re-encoded blended frames. 
If, however, the re-encoder 26 is omitted, then the combiner 24 provides the blended frames directly to the display 13.\nUnfortunately, the set-top box 11 cannot utilize the decoding ability of the display 13, and thus includes its own redundant decoding circuitry, which often adds significant size and cost to the box 11. Typically, the display 13 includes channel-select and full decoding circuitry respectively similar to the channel selector 20 and the decoder 22 of the box 11. Thus, the display 13 typically can directly receive the encoded, multiplexed broadcast video signal, recover the encoded video signal of the selected channel, and decode and display the video frames of the recovered video signal. But the display 13 typically cannot blend overlay frames with the video frames. Therefore, to allow such blending, the box 11 includes the same decoding capability (the decoder 22) as the display 13. The viewer, however, typically requests the display of overlay frames for only a small portion of the time that he/she spends watching a program. Therefore, because the blending abilities of the box 11 are needed only a small part of the time, the decoding abilities of the box 11 are redundant to those of the display 13 most of the time. That is, the viewer paid for two full decoders when one decoder will do the job the vast majority"} +{"output_text": " pests are known to feed on humans and animals, and many species of pests are vectors for pathogenic microorganisms which threaten human and animal health, including commercially important livestock, pets and other animals.\nIn the sandwich assay, the primary antibody B1 is immobilized on the surface of the sensor portion, and the labeling secondary antibody B2 is immobilized on the surface of the sensor portion. 
Accordingly, the distance from the surface of the fluorescent label f to the surface of the sensor portion is short, and", "input_text": " a sandwich assay is performed, fluorescence from a fluorescent label (fluorescent dye molecule f in this case) attached to labeling secondary antibody B2 is detected. The labeling secondary antibody B2 binds to the primary antibody B1 through antigen A. Excitation light is caused to enter the interface between the prism 101 and the gold film 102 at an angle greater than or equal to the total reflection angle to excite surface plasmons on the surface of the gold film 102. Accordingly, the electric field on the surface of the gold film 102 is enhanced. The fluorescent label (fluorescent dye molecule) f is excited in the enhanced electric field, and fluorescence is output. In FIG. 20, the graph shows distance-dependent characteristic of the strength (magnitude) of the electric field, the distance being measured from the surface of the sensor portion (surface of the gold film). As the graph shows, the strength of the electric field sharply decreases as the distance from the surface increases.\nAt this time, the maximum distance from the surface of the sensor portion to the fluorescent label f of the labeling secondary antibody is approximately 50 nm. When the distance from the surface of the sensor portion is approximately 50 nm, the intensity of fluorescence attenuates by 30% or more. Further, the primary antibody B1 is not always immobilized upright on the surface of the sensor portion, and the primary antibody B1 may fall along the surface by the flow of liquid, a three-dimensional obstacle or the like, and be immobilized in a lying or inclined state. Consequently, the distance from the surface of the fluorescent label f to the surface of the sensor portion is varied, and the intensity of the signal is varied. 
Many blood-ingesting pests are known to feed on humans and animals, and many pests are vectors for pathogenic microorganisms which threaten human and animal health, including commercially important livestock, pets and other animals. Various species of"} +{"output_text": ", the outputting MOS transistor 118b cannot be driven to a strong ON state, and hence the outputting MOS transistor 118b cannot be driven to a strong ON state.\nFIG. 28 is a diagram schematically showing a configuration of internal high voltage generating circuit 120 shown in FIG. 27. In FIG. 28, internal high voltage generating circuit 120 includes a ring oscillator 120a oscillating at a prescribed cycle, and a charge pumping circuit 120b generating internal high voltage Vpp by a charge pumping", "input_text": "0-Qn increases to 16 bits or to 32 bits, for example, current dissipation at the level conversion circuits included in output buffers 128-0 to 128-n at data reading becomes non-negligible.\nFIG. 27 is a diagram schematically showing a configuration of internal high voltage generating circuit 120 shown in FIGS. 22 and 26. In FIG. 27, internal high voltage generating circuit 120 includes a ring oscillator 120a oscillating at a prescribed cycle, and a charge pumping circuit 120b generating internal high voltage Vpp by a charge pumping operation of a capacitor according to an output signal of ring oscillator 120a. To increase charge supplying capability of internal high voltage generating circuit 120, it is required to increase an oscillating frequency f of ring oscillator 120a and a capacitance value C of a charge pumping capacitor included in charge pumping circuit 120b. The higher the oscillating frequency of ring oscillator 120a is set, the larger a current consumed by switching operation at ring oscillator 120a becomes. 
In addition, the increase in the capacitance value of the capacitor included in charge pumping circuit 120b leads to increase in capacitor occupation area, and hence in circuit occupation area.\nSince design resources for normal standard DRAMs (Dynamic Random Access Memories) are inherited in configuring internal high voltage generating circuit 120, circuit configuration and layout with an oscillating frequency of ring oscillator 120a and a capacitance value of the charge pumping capacitor of charge pumping circuit 120b both optimized for a standard DRAM are employed. Therefore, when a large number of output buffers 128-0 to 128-n are operated in parallel in synchronization with high-speed clock signal ext.CLK, the charge supplying capability is insufficient, and the voltage level of internal high voltage Vpp decreases, making it impossible to drive outputting MOS transistor 118b shown in FIG. 24 to a strong ON state. In this case, even if the threshold voltage loss does not occur"} +{"output_text": " or glass material.\nIn the conventional art, the carbon nanotube is not used as a material for an electrode, and therefore, it is necessary to form the carbon nanotube into a particular shape in compliance with the shape for application.\nFurther, in the conventional art, the carbon nanotube is not used as a material for an electrode, and therefore, it is necessary to form the carbon nanotube into a particular shape in compliance with the shape for application.\nFurther, in the conventional", "input_text": "ube prepared by the arc discharge process was an SWNT having a diameter of about 5 nm at the leading end. Because of a thin tip and flexibility, even the bottom of a gap of a sample could be observed, and there was available an ideal tip free from a tip crash.\n(3) Hydrogen Storing Material\nA. C. Dillon et al. report, in Nature (vol. 386, 1997, p. 
377-379), that the use of an SWNT permits storage of hydrogen molecules of a quantity several times as large as that available with a carbon generated from a pitch-based raw material. While their study on application has just begun, it is expected to serve as a hydrogen storing material for a hydrogen car or the like.\nIn the configuration and manufacturing method of a carbon nanotube in the conventional art, diameters and directions of resultant carbon nanotubes are very random, and after growth, an electrode is not connected to the carbon nanotube. More specifically, upon application of the carbon nanotube, it is necessary to collect after synthesis for purifying, and form it into a particular shape in compliance with the shape for application.\nFor example, when it is to be used as an electron source, A. G. Rinzler et al. teach the necessity to take out a carbon fiber and to bond an end thereof to an electrode, as reported in Science (vol. 269, 1995, p. 1550-1553).\nFurther, as reported in Science (vol. 270, 1995, p. 1179-1180) and Science (vol. 1, 268, 1995, p. 845-847), Walt A. de Heer et al. disclose the necessity to provide a step of purifying a carbon nanotube prepared by the arc discharge process, and then placing upright the carbon nanotube on a support by the use of a ceramic"} +{"output_text": "018587 discloses a method of treating a subject with Crohn's disease comprising administering to the subject an effective amount of a compound of formula (I):\nwherein: R1 is hydrogen, C1-6alkyl, C1-6alkylcarbonyl, C1-6alkyloxycarbonyl, C1-6alkylaminocarbonyl, C1-6alkylaminocarbonylmethyl, C1-6alkylaminocarbonylaminocarbonyl, C1-", "input_text": "flora. These agents therefore have to be administered by intravenous infusion or subcutaneous injection which requires specialist training in order to use a hypodermic syringe or needle correctly and safely. 
These agents also require sterile equipment, a liquid formulation of the therapeutic polypeptide, vial packing of said polypeptide in a sterile and stable form and a suitable site on the subject for entry of the needle. Subjects commonly experience psychological stress before receiving an injection and pain while receiving an injection. Long term treatment with these systemic anti-TNF-alpha antibodies carries increased risks of serious infection and cancer. Together with the high costs of production, these factors currently restrict use of these agents to patients with more severe disease.\nSeveral small molecule anti-inflammatory and immunosuppressive drugs are also currently in clinical development for Crohn's disease (Danese 2012 Gut 61:918-932 and Shealy et al 2010 mAbs 2:428-439, herein incorporated by reference in its entirety). Although these drugs are orally administered, many will be absorbed systemically after administration and may therefore have systemic immunosuppressive actions that are unrelated to actions against the gastrointestinal tract lesions. Furthermore, as small molecules lack the specificity of antibodies the risk of significant off target side-effects remains high.\nCrohn's disease is primarily a disease of the gastrointestinal tract. The production of TNF-alpha is localised to cells present within mucosal and sub-mucosal tissues and this drives chronic inflammatory processes within the gut wall and the recruitment of additional inflammatory cells that are responsible for development of the disease immunopathology (van Deventer 1999 Ann Rheum Dis 58(Suppl I):I114-I120). 
The ability to deliver an oral therapeutic agent with high selectivity for TNF-alpha, but with exposure and activity limited to the gut, may offer efficacy similar to injectable anti-TNF-alpha antibodies, combined with significant improvements in safety due to reduced systemic exposure.\nWO 2004/"} +{"output_text": " are described in JP-A-11/305032.\nThe compounds of the formula (I) are prepared by reacting a compound of the formula (IX) \nwherein R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R12, R13, R14, R15, R16, R17, R18, R19, R20, R21", "input_text": " S and forming part of the radical A\u2032 and R17 being any desired tertiary group. As particularly customary R17 radicals there may be mentioned by way of example tert-butyl, tert-amyl, 2-methyl-3-buten-2-yl, 2-methyl-3-butyn-2-yl, 4-oxa-2-pentyl or 4,7-dioxa-1-methyl-2-octyl, this being just a small selection of known radicals which are cited in the above-cited publications, which are hereby expressly incorporated herein for further examples.\nVery particular preference is given to compounds of the formula (VII), whose basic structure A\u2032(H)x\u2032 (VIII) is known to lead to synergistic effects with the basic structure A(H)x (VI) of the compound of the formula (I), the known effect being much intensified when the compounds of the formula (I) have higher decomposition temperatures than the compounds of the formula (VII). This is very surprising because one would expect on the contrary that the effect would be at its most intense when the compounds of the formula (I) and (VII) thermally decompose to the pigment at one and the same time (see JP-A-11/305032).\nAn example of this is the use of mixtures of two 1,4-diketo-3,6-diarylpyrrolo[3,4-c]pyrroles of the formulae (I) and (VII) where aryl is phenyl, chlorophenyl, dichlorophenyl, tolyl, p-cyanophenyl, tert-butylphenyl or biphenyl in the formula (I) and m-cyanophenyl in the formula (VII). 
Corresponding compositions"} +{"output_text": " are sometimes used to separate the oil droplets from the gas bubbles, the oil droplets tend to coalesce with the gas bubbles, and the coalesced droplets tend to rise at a terminal velocity that is even slower than the terminal velocity of the individual droplets.\nThe low terminal velocity of the oil droplets in a skim tank is problematic for a number of reasons. First, the low terminal velocity of the oil droplets in a skim tank means that the oil droplets tend to rise at a relatively slow rate, which", "input_text": "Vt = g d^2 \u0394\u03c1 / (18 \u03bc) where: Vt=terminal velocity of the droplet, d=diameter of the droplet, g=gravitational acceleration, \u0394\u03c1=difference in density between the surrounding fluid and the oil droplet, and \u03bc=fluid viscosity. \nStokes' law is valid when the fluid that the droplets are rising through is characterized by laminar flow. More specifically, the Reynolds Number, which is a ratio between the inertial and viscous forces within a fluid and may be used to determine whether fluid flow is laminar or turbulent, should have a value less than 1: Re = \u03c1f Vt d / \u03bc < 1 where: Re=Reynolds Number, and \u03c1f=the fluid density. \nSince the velocity of the droplet depends on the diameter as well, having a small Reynolds Number requires that the diameter of the oil droplet be relatively small for Stokes' law to be valid.\nThe terminal velocity of small oil droplets rising in a skim tank tends to be very slow, on the order of centimeters per second. 
Moreover, since the terminal velocity of a rising oil droplet is proportional to the square of its diameter, this means that smaller oil droplets tend to rise at even slower terminal velocities than larger droplets.\nIn view of the low terminal velocities of oil droplets in a skim tank, SAGD skim tanks are typically large, often exceeding 45\u2032 in diameter and 50\u2032 in height, in order to reduce the mean fluid velocity in the tank for a given inflow rate and create a quiescent environment to allow small oil droplets to separate. To further reduce fluid velocity in the skim tank, conical diffusers are sometimes employed at the fluid inlets, to cause the oil-containing liquid to expand into a wider cross-sectional area and thereby decelerate to a lower velocity immediately before entering the tank. Although barriers"} +{"output_text": " boot-up ROM with a malicious one. The malicious ROM may then be able to perform the same attack as the original ROM.\nA fourth problem is the computer's security. The computer may be attacked by a virus or other malicious software. The virus may be able to attack the operating system and software, and then attack the secure hardware module. For example, the virus may be able to attack the operating system and software, and then attack the secure hardware module.\nA fifth problem is the", "input_text": " critical root of trust software that performs the measuring of the computer operating system and software is stored on a boot-up ROM. The ROM is outside of TPM chipset and located on the computer's mother board because the ROM is application dependent instead of TPM dependent. This system has improved security since TPM is only accessible by measured operating system and measured software. However, this system still has some problems as described below.\nA first problem is the weakness in the interfacing between the computer and the secure hardware module. 
This problem is a result of the computer still having control over the secure hardware module. The computer is an open system that makes it easy for attackers to understand and then simulate what the secure software does in order to access the critical secrets stored in the secure hardware module.\nA second problem is the computer software security. The software controls access to the secure hardware module. Breaking the software may also break the secure hardware module. As operating systems and other software become more and more complicated, their many requirements and performance goals begin to contradict each other. Further, it is becoming more and more difficult to keep the software bug-free due to its increasing size and complexity. An example is the case of Microsoft's new operating system Vista, which showed security weaknesses only several months after it was released to market. Hence, although the operating system and software could be improved, it is difficult to make them completely free of bugs and other security weaknesses.\nA third problem is a hardware attack. The movie pirate may, for instance, be a person who can physically open up the computer to attack the secure hardware module. They may be able to open the secure hardware module and probe the internal bus, and then perform reverse-engineering on the module. For example, in the TPM system, an attacker may not even need to open up the TPM chipset. Instead, they only need to replace the"} +{"output_text": " one mole of 2-ethylhexyl acrylate is shown below: \nThe prepolymer is then reacted with a polyfunctional isocyanate to form the polyurethane dispersion. The polyfunctional isocyanate is preferably a polyisocyanate having at least two isocyanate groups per molecule. Examples of suitable polyisocyanates include, but are not limited to, toluene diisocyanate, hexamethylene diisocyanate, isophorone diisocyanate, and the like. 
The poly", "input_text": " or mixtures thereof.\nThe reaction in which hydroxyl groups are reacted with isocyanate groups and polyurethane prepolymer is produced is usually performed at 50-100\u00b0 C. for 1-5 hours under an inert atmosphere such as nitrogen gas and at atmospheric pressure. Preferably the reaction is performed at 70-90\u00b0 C. for 2-3 hours.\nThe ratio of isocyanate to carboxyl-containing polyol is such as to have the desired amount of grafted carboxyl groups per molecule of polyurethane prepolymer. Usually, the carboxyl-containing monomer is added to result in an acid number for the prepolymer of 10-30 mg KOH/g. The preferred procedure for producing the prepolymer is to react the selected polyisocyanate with regular polyether or polyester polyol for 1-2 hours at 80-90\u00b0 C., and then add carboxyl-containing monomers until the theoretical isocyanate group content has been reached. If desired, catalysts such as dibutyltin dilaurate, stannous octoate, or amine-type catalysts like triethylamine or triethylene diamine, may be used to assist prepolymer formation. The prepolymer composition may also include solvents such as methylethylketone, methylpyrrolidone, and the like\nBecause carboxyl groups are grafted to the polyol molecule, the resulting main polyurethane chain is linear with carboxyl groups as side pendants. This structure is ideal for obtaining good water-borne dispersions. The chemical structure of an exemplary prepolymer made from 1 mole of 1000 molecular weight propylene oxide based diol (Poly-G 20-112 from Arch Chemicals, Inc., Norwalk, Conn.), three moles of 4,4\u2032-dicyclohexylmethane diisocyanate, and"} +{"output_text": "; and third from an indirect activation by the non-visible radiation output by the non-visible-light emitting particles and the direct activation by the electron beam.\nAccording to another exemplary embodiment, the visible-light emitting phosphor is a blue-emitting phosphor. 
The non-visible-light emitting particles are a green-emitting phosphor. The non-visible-light emitting particles are a red-emitting phosphor. The visible-light emitting phosphor is a blue-emitting phosphor. The non-visible-", "input_text": "citing visible-light-emitting phosphor. This two-stage excitation was designed to improve brightness in low-voltage displays and is not sufficient to adequately improve the image brightness in high-voltage displays, such as CRT displays.\nThe present invention is therefore directed to the problem of increasing image brightness in a cathode ray tube.\nThe present invention solves these and other problems by providing an additional phosphor-excitation mechanism to improve the light output of the visible-light emitting phosphor. The present invention provides the ability to improve the light output of not only a four-phosphor arrangement, but also for the traditional three-phosphor display or even a two-color or monochrome CRT, such as a black-and-white (black-and-green, black-and-amber, etc.) CRT. In addition, the present invention can improve the light output for CRTs employing more than four colors.\nAccording to one exemplary embodiment of the present invention, one excitation mechanism indirectly excites the visible-light emitting phosphor by first striking a non-visible-light emitting particle (such as an ultraviolet-emitting phosphor) with an electron beam, which then emits non-visible radiation that strikes a visible-light emitting phosphor, thereby activating the visible-light emitting phosphor. A second mechanism simultaneously excites the visible-light emitting phosphor by directly striking the visible-light emitting phosphor with an electron beam. 
Thus, the same visible-light emitting phosphor is activated by the first indirect mechanism as well as the second direct mechanism.\nAccording to another exemplary embodiment, image brightness can be optimized by disposing non-visible-light emitting particles behind and next to the visible-light emitting phosphors. The result is three different excitation modesxe2x80x94first from a direct activation by the electron beam; second from an indirect activation by non-visible radiation output by the non-visible-light emitting particles"} +{"output_text": " a plurality of optical transmitters 212, which transmit optical carrier signals to PON extender 204. PON extender 204 includes a plurality of optical receivers 214, which receive optical carrier signals from PON extender 204. PON extender 204 includes a plurality of optical transmitters 216, which transmit optical carrier signals to OHE 202. PON extender 204 includes a plurality of optical receivers 218, which receive optical carrier signals from OHE 202. PON extender 204 includes a plurality", "input_text": " from PON trunk fiber 110 into the different fixed wavelengths, which are then carried between splitter 104 and ONUs 106 by individual short fibers 112.\nConventional architectures like system 100, however, presently experience several drawbacks. Most OHEs, for example, have fewer PON trunk fibers available to the splitter, or node, than are required for the increasing number of subscribers. Additionally, many modern cable operators utilize a Data Over Cable Service Interface Specification (DOCSIS) infrastructure that may potentially transmit as far as 100 miles, which is considerably farther than distances supported by conventional PON technologies, which are typically limited to 20 kilometers (km). Therefore, a conventional PON extension system has been utilized to extend the transmission range of PON networks up to these increasing ranges required by a cable operator.\nFIG. 
2 is a schematic illustration of a conventional PON extension system 200 for deploying a PON over distances greater than 20 km. System 200 includes an OHE 202, a PON extender 204, and a plurality of ONUs 206, which may be in communication with a plurality of respective customer premises (not shown). ONUs 206 transmit and receive optical carrier signals to/from PON extender 204 by short fibers/nodes 208, and PON extender 204 connects with OHE 202 through trunk fiber 210. Short fibers/nodes 208 recover PON signal streams from PON extender 204 and transmit the recovered signals to ONUs 206 using standard PON optics. Respective nodes of short fibers/nodes 208 may also function as splitters. ONUs 206 will include 32-64 ONUs per group, and will have a symmetric architecture (e.g., ONU 206(1), 10/10G-EPON), or an asymmetric architecture (e.g., ONU 206(1)\u2032, 10/1G-EPON).\nOHE 202 includes"} +{"output_text": ") and cellular technologies.\nGNSS is a satellite-based positioning system that uses signals from a number of satellites to determine a location of a receiver. GNSS is widely used for navigation and positioning, but it is not suitable for indoor positioning.\nCellular technologies are based on the triangulation of a mobile device's location using the cell towers that the mobile device is connected to. Cellular technologies are widely used for indoor positioning, but they are not suitable for outdoor positioning.\nIn order to", "input_text": " rotation of the swing arm, which tends to return the swing arm to its initial position. This restoring moment is set up whenever the cyclist pedals harder, thereby avoiding the onset of the bobbing effect. Obviously, when overcoming an obstacle, the restoring moment adds itself to the restoring force of the shock absorber, considerably improving motive power when climbing hills for example. 
Also, unlike the teaching of French patent FR 2.774.966 in which the instantaneous centre of rotation of the swing arm is located in the upper anterior quadrant of the bicycle, the position of the instantaneous centre of rotation is located in the upper posterior quadrant and the position of the shock absorber on the down tube of the bicycle imparts greater rigidity to the underframe/suspension assembly making it possible to clear the space between the rear wheel and the seat so as to fix a rear mudguard, carrier rack or similar.\nAccording to one particularly advantageous characteristic of the rear suspension of the invention, the instantaneous centre of rotation of the swing arm moves globally along a straight line perpendicular to the upper strand of the chain when the hub axle of the drive wheel moves, so that the restoring moment is proportional to the movement of the hub axle of the drive wheel. The present invention relates to positioning technology, in particular, hybrid positioning with blending multiple location technologies.\nLocation based services are an emerging area of mobile applications that leverages the ability of new devices to calculate their current geographic position and report that to a user or to a service. Some examples of these services include identifying a location of a person or an object in the context of entertainment, work, health or personal life.\nLocation based services require instantaneous reliable positioning system that can work everywhere. Since no single positioning technology can meet such requirement, combining different positioning technologies to compensate for individual technology's own limitation can provide better results. 
Such combinations typically include Global Navigation Satellite System (GNSS"} +{"output_text": " addresses this need.\nThe present invention relates to a method for producing a semiconductor device, and more particularly, to a method for producing a semiconductor device having a trench isolation structure.\nIn general, a semiconductor device is manufactured by forming a plurality of active regions on a semiconductor substrate, forming a gate insulating layer and a gate electrode on the active regions, forming a source/drain region in the active regions by ion implantation, forming a sidewall spacer on the sidewalls of the gate electrode,", "input_text": ", at which point a plasma is struck with an rf power source 108 and creates plasma effluent 104 from discharge orifice 117 at one end of cavity 110 to clean substrate 102. In general, the shape and diameter of discharge orifice 117 may affect the corresponding shape of plasma effluent 104 along both the lateral and longitudinal axis (e.g., laterally narrow and longitudinally deep, laterally wide and longitudinally shallow, etc.). However, as previously stated, a large volume of inert gas may be required to prevent the generation of arc 105 between powered electrode 106 to grounded electrode 112.\nReferring now to FIG. 2, a simplified diagram of an atmospheric plasma jet device, in which a powered electrode is configured as a center rod and a grounded electrode(s) is configured on a cavity inner surface, is shown. As before, generally, an inert gas 118 (e.g., He, etc.) and a process gas 116 (e.g., CF4, H2, etc.) are flowed into sealed box 114 for pressurizing. The gases are, in turn, feed into a discharge chamber cavity 110 through gas influent 115, at which point a plasma may be struck with an rf power source 108 and creates plasma effluent 104 from discharge orifice 117 at one end of cavity 110 to clean substrate 102. 
In general, the shape and diameter of discharge orifice 117 may affect the corresponding shape of plasma effluent 104 along both the lateral and longitudinal axis (e.g., laterally narrow and longitudinally deep, laterally wide and longitudinally shallow, etc.). However, as previously stated, a large volume of inert gas may be required to prevent the generation of arc 105 between powered electrode 106 to grounded electrode 112.\nIn view of the foregoing, there are desired apparatus for the removal of a metal oxide from a substrate and methods therefor. The present invention"} +{"output_text": " (VoATM), and Voice over Frame Relay (VoFR) are cost-effective alternatives in this changing market. However, to make migration to these technologies possible, the industry has to ensure quality of service (QoS) for voice and determine how to charge for voice transfer over data lines. The Telecommunications Deregulation Act of 1996 further complicates this environment. This legislation will reinforce a symbiotic relationship between the voice protocol of choice, ATM, and the data protocol of choice, IP", "input_text": "\nIn today\"\"s networked world, bandwidth is a critical resource. Increasing network traffic, driven by the Internet and other emerging applications, is straining the capacity of network infrastructures. To keep pace, organizations are looking for better technologies and methodologies to support and manage traffic growth and the convergence of voice with data.\nToday\"\"s dramatic increase in network traffic can be attributed to the popularity of the Internet, a growing need for remote access to information, and emerging applications. The Internet alone, with its explosive growth in e-commerce, has placed a sometimes insupportable load on network backbones. It is also the single most important cause of increased data traffic volumes that exceed voice traffic for the first time. 
The growing demands of remote access applications, including e-mail, database access, and file transfer, are further straining networks.\nThe convergence of voice and data will play a large role in defining tomorrow\"\"s network environment. Currently, the transmission of data over Internet protocol (IP) networks is free. Because voice communications will naturally follow the path of lowest cost, voice will inevitably converge with data. Technologies such as Voice over IP (VoIP), Voice over ATM (VoATM), and Voice over Frame Relay (VoFR) are cost-effective alternatives in this changing market. However, to make migration to these technologies possible, the industry has to ensure quality of service (QoS) for voice and determine how to charge for voice transfer over data lines. The Telecommunications Deregulation Act of 1996 further complicates this environment. This legislation will reinforce a symbiotic relationship between the voice protocol of choice, ATM, and the data protocol of choice, IP.\nIntegrating legacy systems is also a crucial concern for organizations as new products and capabilities become available of lowest cost, voice will inevitably converge with data. Technologies such as Voice over IP (VoIP), Voice over ATM"} +{"output_text": " equipment shall transmit a CSI report on the Physical Uplink Control Channel (PUCCH) within a certain time interval. The time interval is configurable by the eNodeB. The CSI report is transmitted on the PUCCH using the resources allocated for the transmission of the CSI report.\nThe CSI report is transmitted on the PUCCH using the resources allocated for the transmission of the CSI report. The CSI report is transmitted in a resource block pair (RB", "input_text": " as outlined earlier, while the term set S represents generally a subset of the whole set of resource blocks in the system bandwidth. 
In the context of 3GPP LTE and LTE-A, the set S so far is defined to always represent the whole cell, i.e., component carrier bandwidth, a frequency range of up to 20 MHz, and is for simplicity hereafter referred to as \u201cwideband\u201d.\nAperiodic & Periodic CQI Reporting\nThe periodicity and frequency resolution to be used by a UE to report the CSI are both controlled by the eNodeB. The Physical Uplink Control Channel (PUCCH) is used for periodic CSI reporting only; the PUSCH is used for aperiodic reporting of the CSI, whereby the eNodeB specifically instructs the UE to send an individual CSI report embedded into a resource which is scheduled for uplink data transmission.\nIn order to acquire CSI information quickly, eNodeB can schedule aperiodic CSI by setting a CSI request bit in an uplink resource grant sent on the Physical Downlink Control Channel.\nIn 3GPP LTE, a simple mechanism is foreseen to trigger the so-called aperiodic channel quality feedback from the user equipment. An eNodeB in the radio access network sends an L1/L2 control signal to the user equipment to request the transmission of the so-called aperiodic CSI report (see 3GPP TS 36.212, section 5.3.3.1.1 and 3GPP TS 36.213, section 7.2.1 for details). 
Another possibility to trigger the provision of aperiodic channel quality feedback by the user equipments is linked to the random access procedure (see 3GPP TS 36.213, section 6.2).\nWhenever a trigger for providing channel quality feedback is received by the user equipment, the user"} +{"output_text": " plurality of pixel electrodes formed on the insulative substrate; a plurality of thin film transistors formed on the insulative substrate and each having a gate electrode, a semiconductor layer, a source electrode and a drain electrode; a plurality of storage capacitance electrodes formed on the insulative substrate and each connected to a corresponding one of the plurality of pixel electrodes; and a plurality of gate lines and a plurality of data lines formed on the insulative substrate and each connected to a corresponding one of the plurality of thin film transistors", "input_text": " junction between the source electrode 243 and the wiring 253. This structure involves only a single conductive through hole (the conductive through hole 236), and thereby achieves an increase in aperture ratio.\nNevertheless, the liquid crystal display of single through hole type described above was hardly applicable due to the following problems. That is, in this type, the wiring 253 and the storage capacitance electrode 251 consisting of a transparent conductive film, typically of ITO, were formed between the gate insulating film 202 and the upper insulating film 203. This meant an additional patterning as compared with other conventional types. Then, the ITO patterning used aqua regia, whereas the drain electrode 242 and the source electrode 243 in the TFT section were formed of n+ type amorphous silicon films which dissolve in aqua regia. Therefore, an additional step for protecting the electrodes was required. 
Moreover, because the patterning of transparent conductive films is rather poor in working precision as compared with metal films, the storage capacitance electrode 251 patterned out of a transparent conductive film has more capacitance variations or defects. Accordingly, the pixels varied in image stability, which produced unevenness in the entire screen view. Furthermore, the wiring 253 was directly connected to the source electrode 243 made of an n+ type amorphous silicon film, whereas the connecting interface between an ITO film and an n+ type amorphous silicon film is high in contact resistance. This made a capacitance-charging time delay not-negligibly large, thereby hampering sufficient charging.\nAn object of the present invention is to provide a liquid crystal display, particularly of high resolution, which has an improved aperture ratio and high reliability as well as is simple in structure and capable of low-cost and high-yield fabrication, and which can neglect the capacitance-charging time delay, and a method of fabricating the same.\nA liquid crystal display according to the present invention comprises: an insulative substrate; a"} +{"output_text": " cancer and to improve the accuracy of the staging procedures.\nThe RIGS system has been used in the detection and location of occult tumor in the following clinical studies:\n1. A study of the use of the RIGS system in the detection of occult tumor in the liver, including the use of the system in the detection of occult tumor in the liver in patients with colorectal cancer.\n2. A study of the use of the RIGS system in the detection of occult tumor in", "input_text": " initial mucosal growth, a tumor may progress locally in several directions, but usually it protrudes first into the lumen. Mural penetration may result in local failure or peritoneal seeding.\nColorectal cancer first metastasizes to the perirectal nodes at the level of the primary tumor or immediately above it. 
Next, the chain accompanying the superior hemorrhoidal vessels is involved. In later stages of disease, when the hemorrhoidal lymphatics are blocked, there is lateral downward spread. In colon carcinoma, normal lymphatic flow is through the lymphatic channels along the major arteries, with three echelons of lymph nodes: pericolic, intermediate, and principal. If tumors lie between two major vascular pedicles, lymphatic flow may drain in either or both directions. If the central lymph nodes are blocked by tumor, lymphatic flow can become retrograde along the marginal arcades proximally and distally. The risk for lymph node metastases increases with increasing tumor grade, as does the number of lymph nodes affected.\nThe liver is the primary site of hematogenous metastases, followed by the lung. Involvement of other sites in the absence of liver or lung involvement is rare.\nImplantation refers to the release of tumor cells from the primary tumor and their deposition on another surface. Implantation has been reported with tumor cells shed intraluminally, from the serosal surface through the peritoneum, and by surgical manipulation and resultant deposition on wound surfaces.\nThe contribution of RIGS-based surgery to enhancing the vision-based and touch-based procedures of the surgeon has been substantial. The detection and location approach of this system has permitted the identification and removal of hidden or occult tumor under conditions where otherwise conventional procedures would not have found it. Additionally, the system has been employed in staging, particularly in evaluating lymph nodes and other metastatic disease for staging procedures. 
The system has been demonstrated in clinical studies to substantially improve the staging of"} +{"output_text": "channel-aware application is not changed by the non-alpha-channel-aware application, the alpha value of such pixel data is set to zero.\nThe result is that the alpha value of the pixels in the image drawn by the non-alpha-channel-aware application is set to zero. The alpha value of the pixels in the image drawn by the non-alpha-channel-aware application is not changed by the non-alpha-channel-aware application. The alpha value of the pixels", "input_text": "\nNon-Alpha-Channel-Aware Applications\nAlpha blending and the use of the alpha value are not known by many older applications (non-alpha-channel-aware applications) that create pixel data files. In the usual case, a non-alpha-channel-aware application takes as input a pixel data file to be written to, and, in that pixel data file, overwrites the pixel data value for any number of pixels. In this way, a new image is written into the pixel data file. Any image information which had previously existed in the input image data file is overwritten in each pixel that has been written to by the non-alpha-channel-aware application. The output of the non-alpha-channel-aware application is the partially or totally rewritten pixel data file.\nImportantly, when new pixel information is written in to the pixel data file, many non-alpha-channel-aware applications set the last byte of the pixel data value to zero for each pixel that is overwritten. This byte is the byte used as the alpha value in alpha-channel-aware-applications.\nUsing A Non-Alpha-Channel-Aware Application in an Alpha-Channel-Aware Environment\nWhen a non-alpha-channel-aware application generates a pixel data file for use with an alpha-channel-aware environment, an issue arises in that the alpha value for each pixel in the image drawn by the non-alpha-channel-aware application is set to zero. 
Pixel data not created by the non-alpha-channel-aware application is not changed by the non-alpha-channel-aware application, so the original alpha value of such pixel data (which may be greater than zero) is retained. These pixels may appear if the pixel data file is used in an alpha-channel-aware environment. But because pixel data created by the non-alpha-"} +{"output_text": " of the two half-frames to the horizontal position.\nThe above skate, however, does not allow the shoe to be articulated in a manner which is as natural as possible, because the elastic element is inserted in the region of the hinge, and therefore the shoe is not able to follow the natural movement of the foot.\nEP-0 774 282 discloses a skate with in-line wheels which comprises a shoe associated a wheel supporting frame which has, approximately in a median region, a cut", "input_text": " to allow the articulation of said shoe.\nA similar configuration is provided if the shoe is associated with a frame for supporting in-line wheels.\nThe above skate follows the natural movement of the foot, but at the same time it does not ensure the adequate technical characteristics required for sports practice, because, for example, the seat for the frame or blade provided in the rear region of the shoe is formed with plays which allow it to perform a combined rotary and translatory motion. This in no way ensures adequate lateral support and torsional strength, particularly during side-slip braking, which is notoriously the most trying step for the rear region of the skate.\nEP-0 774 282 discloses a skate with in-line wheels which comprises a shoe associated a wheel supporting frame which has, approximately in a median region, a cutout which divides the frame into two components and forms a deformable region which allows the articulation of the two components of the frame. 
An elastic element is interposed between said components and is adapted to improve their return to the horizontal position.\nIn the above skate, it is noted that the deformation region, and therefore the corresponding flexing region of the shoe, do not meet the anatomical requirements of the foot because it is observed that the position of the rotation axis is distinctly too far back with respect to the natural axis of rotation of the metatarsus.\nWO-97/18019 discloses a skate which is again composed of a shoe associated with a wheel supporting frame which is composed of two half-frames frames which are mutually rotatably connected by means of a common mechanical hinge which is located approximately in the vicinity of the axis of the second wheel starting from the front end of the shoe.\nAn elastic element is inserted approximately in the region located directly above the hinge, and its ends rest at the two half-frames, so as to facilitate the return"} +{"output_text": " to synchronize with the base station. The ARC is a common channel for all mobiles in a cell, and it is used to synchronize the mobiles to the base station. The ARC is a common channel for all mobiles in a cell, and it is used to synchronize the mobiles to the base station. The ARC is a common channel for all mobiles in a cell, and it is used to synchronize the mobiles to the base station. The AR", "input_text": " of silence. One possible solution to the link maintenance problem when bursty transmission is required on reverse links of CDMA systems is to assign a separate low bit rate physical control channel (created using code division multiplex) to each portable in a given cell. This approach, proposed by some investigators [2], can be called a continuous transmission medium access control scheme (CTX-MAC), and it unfortunately causes increased multi-user interference introduced by continuously transmitted maintenance signals. Increased interference naturally leads to reduced traffic capacity. 
Also its implementation is characterized by significantly increased complexity of base station receivers, where each physical control channel (serving one portable) would require a separate despreading correlator. If the duty cycle (ratio of the on-period to off-period of the mobile\"\"s transmitter) as shown in FIG. 1 is small, then considerable savings in both base station hardware and system capacity may be achieved if transmission is discontinued during the off-periods and the hardware is shared among different users. This is the discontinuous transmission medium access control (DTX-MAC) scheme. With the DTX-MAC scheme, when a particular user has a data packet to send, the base station needs to be notified by the mobile of its intention to transmit, and synchronization needs to be quickly resumed. To achieve that an access request message (ARM) is transmitted from the mobile to base station to acquire synchronism, and to inform the base station of the mobile\"\"s identity. An access reservation channel (ARC) is allocated for such a purpose. All mobiles in a cell (or sector) use the same PN code to send their ARMs on the ARC; this avoids the need to have a separate receiver for each mobile, even in the off mode. If the number of sources is large, then more than one ARC may be used.\nA MAC protocol is required on the ARC for the mobiles"} +{"output_text": ", such as silicon dioxide, silicon nitride, and silicon oxynitride, and can be filled with aluminum, aluminum alloys, copper, and copper alloys.\nThe invention provides a filled cavity structure, such as a contact or via, and process of cavity structure fill that can provides for a cavity fill at heretofore unprecedented low temperatures. The structure and process of the invention permit cavity filling with aluminum, aluminum alloys, copper, and copper alloys. 
Contacts and vias to be filled with such metals", "input_text": " as dielectrics in the integrated circuit, as these polymeric materials typically decompose at such high temperatures.\nIn view of the foregoing deficiencies in the prior art, it is desirable to provide an integrated circuit filling process for contacts and vias which provides for a reliable filling at relatively low temperatures, preferably on the order of about 250-450xc2x0 C. Contact and via filling at such low temperatures will permit for the use of more optimal dielectric materials which is critical to the development of sub-0.5 xcexcm technologies.\nAs a consequence of the foregoing prior art deficiencies, it is desirable to provide a process for the filling of integrated circuit contacts and vias which is operable at relatively low temperatures of no more than about 300xc2x0 C., and preferably between about 20xc2x0-275xc2x0 C., which temperature range will permit for the use of low dielectric constant (xcexa) polymers (i.e., xcexa less than xcx9c3.0), the use of which has heretofore not been possible due to the high temperatures required in prior art via filling processes.\nThe invention provides a filled cavity structure, such as a contact or via, and process of cavity structure fill that can provides for a cavity fill at heretofore unprecedented low temperatures. The structure and process of the invention permit cavity filling with aluminum, aluminum alloys, copper, and copper alloys. Contacts and vias to be filled with such metals can optionally be lined with physical vapor deposition (xe2x80x9cPVDxe2x80x9d) or chemical vapor deposition (xe2x80x9cCVDxe2x80x9d) refractory metals and/or metal alloys prior to deposition of the cavity fill material to enhance deposition of the fill metal within the cavity. 
The cavities to be filled can be formed through various dielectrics"} +{"output_text": " then pays the issuing bank.\nThe issuing bank presents the documentation and its proof of payment to the intermediary bank, which pays the intermediary bank. The intermediary bank then pays the paying bank.\nThe paying bank presents the documentation and its proof of payment to the intermediary bank, which pays the intermediary bank. The intermediary bank then pays the seller.\nThe seller presents the documentation and its proof of payment to the intermediary bank, which pays the intermediary bank. The", "input_text": " instructs the issuing bank to open a letter of credit in favor of the seller confirmed on its chosen paying bank. The letter of credit may be confirmed, unconfirmed or standby. In a standby letter of credit, if the transaction proceeds properly, the standby L/C expires, but if the transaction does not proceed properly, the damaged party draws on the standby L/C.\nThe issuing bank is assumed in this example to have no direct relationship with the paying bank, so the issuing bank approaches an intermediary bank which accepts the guarantee to pay of the issuing bank. The intermediary bank then approaches the paying bank, which accepts the guarantee to pay of the intermediary bank.\nThe paying bank then advises the seller that an L/C has been opened in its favor and that upon presentation of appropriate confirming documentation, including the Bill of Lading from a Shipper, the paying bank will pay the seller. In this scenario, that is, assuming a confirmed L/C, the paying bank must pay without recourse upon presentation of appropriate documentation. In other cases, the paying bank has recourse, that is, the paying bank passes the documentation to the issuing bank and obtains payment therefrom before paying the seller.\nThe seller finishes producing the goods and arranges for shipment with a shipper. Goods are passed to the shipper. 
The shipper transports the goods to a port of entry in the buyer\"\"s country.\nUpon receipt of goods, the shipper provides the seller with a bill of lading. The seller presents the bill of lading and other confirming documentation to its paying bank in order to collect against the L/C. After verifying that the documentation is in order, the paying bank pays the seller.\nThe paying bank presents the documentation and its proof of payment to the intermediary bank, which pays the paying bank. The intermediary bank"} +{"output_text": ". U.S. Pat. No. 5,584,483 (Moody) discloses a gaming system with a progressive jackpot. U.S. Pat. No. 5,683,085 (Hewlett) discloses a gaming system with a progressive jackpot. U.S. Pat. No. 5,947,820 (Morris et al.) discloses a gaming system with a progressive jackpot. U.S. Pat. No. 6,257,", "input_text": " would make the modification see in FIG. 1b difficult at a later date particularly since now the design intent may be baked into the model contrary to design intent.\nThe issue with the history-based approach is that design intent is incorporated and fixed at the time of model creation, which can complicate making changes later-on that were not anticipated at the time of model creation. In contrast, the history-less systems are flexible about change at a later date, but capture very little intelligence about how things are related. If modify designers determine to manually capture such intelligence at a later point in time, then, like history-based systems, that intelligence is incorporated and fixed thereby limiting further flexibility.\nThe inventors have advantageously recognized a need for a system and method to provide direct edit capabilities on a solid model where the current geometry is examined and joined with various model constraints so that dependencies are localized in real-time. 1. 
Field of the Invention\nThe present invention relates to video gaming apparatus, methods of play in video gaming apparatus, and novel features used in the playing of video games, especially video games with bonus features.\n2. Background of the Art\nWagering games (e.g., roulette, craps, slots, video poker, table card games, and gaming machines or computers using gaming software), including those intended primarily for play in casinos, should provide players with a sense of participation and control, the opportunity to make decisions, and reasonable odds of winning, even though the odds favor the casino, house, dealer or banker. The game must also meet the requirements of regulatory agencies.\nWagering games, including wagering games for casino play, with multiple wagering opportunities are known. U.S. Pat. Nos. 4,861,041 and 5,078,405 (both to Jones et al.) disclose methods and apparatus for progressive jackpot gaming"} +{"output_text": " of 0 to 6.\nRL05 is a straight, branched or cyclic alkylene group of 1 to 18 carbon atoms, preferably 1 to 10 carbon atoms when they form a ring. Exemplary alkylene groups are methylene, ethylene, 1,2- and 2,3-propylene, 1,2- and 2,3-butylene, 1,2- and 2,3-pentylene, 1,2- and 2,3-hexylene, 1,2-", "input_text": " examples are the substituted alkyl groups shown below. \nA pair of RL01 and RL02, RL01 and RL03, or RL02and RL03 may form a ring. Each of RL01, RL02 and RL03 is a straight or branched alkylene group of 1 to 18 carbon atoms, preferably 1 to 10 carbon atoms when they form a ring.\nRL04 is a tertiary alkyl group of 4 to 20 carbon atoms, preferably 4 to 15 carbon atoms, a trialkylsilyl group in which each alkyl moiety has 1 to 6 carbon atoms, an oxoalkyl group of 4 to 20 carbon atoms, or a group of formula (L1). 
Exemplary tertiary alkyl groups are tert-butyl, tert-amyl, 1,1-diethylpropyl, 2-cyclopentylpropan-2-yl, 2-clohexylpropan-2-yl, 2-(bicyclo[2.2.1]heptan-2-yl)propan-2-yl, 2-(adamantan-1-yl)propan-2-yl, 1-ethylcyclopentyl, 1-butylcyclopentyl, 1-ethylcyclohexyl, 1-butylcyclohexyl, 1-ethyl-2-cyclopentenyl, 1-ethyl-2-cyclohexenyl, 2-methyl-2-adamantyl, and 2-ethyl-2-adamantyl. Exemplary trialkylsilyl groups are trimethylsilyl, triethylsilyl, and dimethyl-tert-butylsilyl. Exemplary oxoalkyl groups are 3-oxocyclohexyl, 4-methyl-2-oxooxan-4-yl, and 5-methyl-2-oxooxolan-5-yl. Letter x is an integer"} +{"output_text": " always conducting. The other diode 2 and 4 is electrically connected in parallel with the energy storage capacitor 9. The two diodes 2 and 4 are connected in series with each other. The two diodes 2 and 4 are connected in parallel with the energy storage capacitor 9. The two diodes 2 and 4 are connected in series with each other. The two diodes 2 and 4 are connected in parallel with the energy storage capacitor 9. The two diodes 2 and 4 are connected in series with each other. The two diodes", "input_text": " respective valve arms T1, T3 and T5 as well as T2, T4 and T6, since the bridge arm elements each represent a converter valve of the polyphase converter with distributed energy stores. Each of these valve arms T1 to T6 has a number of two-pole subsystems 10 which are electrically connected in series. In this equivalent circuit, four of these subsystems 10 are shown. The number of subsystems 10 per valve arm T1,..., T6 is, however, not restricted to this illustrated number. Each junction point between two valve arms T1 and T2; T3 and T4 as well as T5 and T6 of a phase module 100 forms a respective connection L1, L2 and L3 on the AC voltage side of a phase module 100. 
Since, in this illustration, the converter has three phase modules 100, a three-phase load, for example a polyphase motor, can be connected to their connections L1, L2 and L3 on the AC voltage side, also referred to as load connections.\nFIG. 2 shows an equivalent circuit of one known embodiment of a two-pole subsystem 10 in more detail. The circuit arrangement shown in FIG. 3 represents a functionally completely equivalent variant. Both embodiments of a two-pole subsystem 10 are known from DE 101 03 031 A1. These known two-pole subsystems 10 each have two semiconductor switches 1 and 3 which can be turned off, in each case two diodes 2 and 4 and in each case one unipolar energy storage capacitor 9. The two semiconductor switches 1 and 3 which can be turned off are electrically connected in series, with this series circuit being connected electrically in parallel with the energy storage capacitor 9. One of the two diodes 2 and 4 is electrically connected in parallel with each semiconductor switch 1 and 3 which can be turned off such that these diodes 2 and 4 are"} +{"output_text": ", however, the buffer 13 cannot absorb the fluctuation. Consequently, the buffer 13 is required to have a capacity of, for example, 4 frames.\nIn the case where the buffer 13 is used, therefore, the capacity of the buffer 13 is increased, and the circuit scale is increased. In the case where the buffer 13 is not used, the capacity of the buffer 13 is not increased, and the circuit scale is not increased.\nIn the case where the buffer 13 is used, however", "input_text": "1 KHz. Therefore, 1.4112 Mbits/sec. is attained. It is preferable to use an integer multiple of 1.4112 MHz as the clock signal for signal processing. In consideration of decoding of CIRC, etc., usually, a fixed clock signal of 8.4672 MHz which is six times. Consequently, the clock signal for signal processing for 1 frame consists of 1,152 clock pulses.\nFIG. 63 is a diagram illustrating the operation of the buffer. 
The buffer 13 has a capacity of, for example, \u00b14 frames, and is configured so that a predetermined byte is stored at an address in the unit of a frame with using the synchronizing signal as the reference. With respect to the write address and the read address, the same addresses exist at positions which are shifted from each other by 4 frames. In the case where the write address and the read address coincide with each other, therefore, when demodulated data are written, X-demodulated data which were written at the timing preceding by four frames before are read out.\nThe demodulated data read out from the buffer 13 are transferred to a memory for storing an amount of data which is required for decoding in the CIRC decoder 21, and subjected to error correction, etc. by using the clock signal for signal processing. The CD data are sent to the CD-ROM decoder 22 and reproduced as user data.\nWhen the clock signal for signal processing is fixed as in the case of a CD player, a difference between the write address and the read address is produced in the case where disturbance causes the rotation of the disk to fluctuate and the reproducing speed of reproduced data is changed. In such a case, rotation fluctuation of 3 frames or less can be absorbed by the buffer 13. When a difference of 4 frames or more is caused by large rotation fluctuation"} +{"output_text": " the second substrate 10, 70 is sealed by a sealant 80. A plurality of gate lines (GL) and a plurality of data lines (DL) are formed on the first substrate 10, and a plurality of switching thin-film transistors (STr) are formed at intersections of the gate lines (GL) and the data lines (DL). A plurality of driving thin-film transistors (DTr) are formed at intersections of the switching thin-film transistors (STr) and the data", "input_text": "). 
Also, a storage capacitor (StgC) may be formed between the gate electrode and the source electrode of the driving thin-film transistor (DTr).\nWhen a signal is applied via the gate line (GL), the switching thin-film transistor (STr) is turned on, and a signal of the data line (DL) is transferred to a gate electrode of the driving thin-film transistor (DTr) to turn on the driving thin-film transistor (DTr), thereby emitting light through the organic electro-luminescence diode (D). At this time, when the driving thin-film transistor (DTr) enters an ON state, the level of a current flowing through the organic electro-luminescence diode (D) from the power line (PL) is determined, thereby determining a gray scale. The storage capacitor (StgC) may perform the role of constantly maintaining a gate voltage of the driving thin-film transistor (DTr) when the switching thin-film transistor (STr) is turned off, thereby constantly maintaining the level of the current flowing through the organic electro-luminescence diode (D) until the next frame, even if the switching thin-film transistor (STr) enters an OFF state before then. The organic electro-luminescence device performing such a driving operation may be classified into a top emission type and a bottom emission type.\nFIG. 2 is a plan view illustrating a top emission type organic electro-luminescence device, and FIG. 3 is a cross-sectional view illustrating one pixel area including a driving thin-film transistor of the top emission type organic electro-luminescence device, as a cross-sectional view of an \u201cA\u201d portion of FIG. 2. Referring to FIGS. 
2 and 3, a first and a second substrate 10, 70 are disposed to face each other, and an edge portion of the first and"} +{"output_text": " the video signal and the other signals, and supplies the video signal to the video receiver 133a via the receiving photodiode (PD) 132a.\nIn the subscriber unit 105b, the wavelength division multiplexer/demultiplexer (WDM) 131b demultiplexes the input signal into the video signal and the other signals, and supplies the video signal to the video receiver 133b via the receiving photodiode (PD) 132b. On the other hand, the", "input_text": " and an A/D (Analog/Digital) converter 146a to which a facsimile machine 148a is connected. A personal computer 149a is directly connected to the electric signal multiplexer/demultiplexer 144a. The subscriber unit 105b connected to the optical fiber 104b has a similar configuration. When no video receiver is required as in the subscriber unit 105b, a terminator 135b is connected in place of the receiving photodiode (PD).\nNext, the operation will be described.\nIn the central office unit 101, the video signal generator 111 supplies its video signal to the transmitting laser diode (LD) 112. The transmitting laser diode (LD) 112 supplies it to the wavelength division multiplexer/demultiplexer (WDM) 113 in the form of the optical signal. The wavelength division multiplexer/demultiplexer (WDM) 113 multiplexes the optical signal with the optical signal from the transmitting and receiving section 114, and supplies it to the star coupler 102 via the optical fiber 103. The star coupler 102 splits the signal and supplies the split signals to the subscriber units 105a, 105b and the like.\nIn the subscriber unit 105a, the wavelength division multiplexer/demultiplexer (WDM) 131a demultiplexes the input signal into the video signal and the other signals, and supplies the video signal to the video receiver 133a via the receiving photodiode (PD) 132a. 
On the other hand, the signals other than the video signal are supplied to the receiving photodiode (PD) 142a via the wavelength division multiplexer/demultiplexer (WDM) 141a in the transmitting and receiving section 134a, to be converted into the electric signal. Then, the electric signal multiplexer/demultiplexer 144a demultiplexes the electric signal into"} +{"output_text": " the web path are a plurality of air jets which are directed toward the web fluttering. The air jets are arranged to create a pressure differential between the upper and lower parts of the web path, the upper part being higher in pressure than the lower. The pressure differential is created by the air jets which are directed toward the web fluttering. The air jets are arranged to create a pressure differential between the upper and lower parts of the web path, the upper part being higher in pressure than the lower.", "input_text": " jets through both walls into the space therebetween. The necessity of the walls on both sides of the web path manifests itself as a critical drawback when the apparatus is to be utilized for web flutter suppression during web splicing. At the supply roll station of a web-fed rotary printing press, for example, the space for wall installation is available only on one side of the web during splicing, the other side being occupied by a new roll against which the web now being printed is to be pressed for splicing. This prior art apparatus is therefore unapplicable to this end.\nJapanese Unexamined Utility Model Publication No. 58-83346 teaches the use of a hollow structure for conveying ultrathin sheet material therethrough. At the upstream end of this hollow structure there are provided nozzles for creating two airstreams in the upper and lower parts of its interior, the upper stream being higher in velocity than the lower. 
Ultrathin sheet material is pneumatically transported down the hollow structure, always floating by virtue of the pressure differential caused by the difference between the speeds of the airflows on its upper and lower sides.\nThis prior art pneumatic transportation system is well calculated to keep ultrathin sheet material straight as it travels through the hollow structure. No consideration is, however, made as to how to keep the material from fluttering. For this reason alone the system is unfit for flutter control of traveling webs, not to mention the fact that its mechanical construction inhibits its use for that purpose during web splicing for the same reasons as have been set forth in connection with the first described prior art.\nJapanese Utility Model No. 2,503,149 is explicitly designed to damp web fluttering during web splicing. Employed to this end are baffle plates for damping fluttering of the web which travels close to the new web roll to which that old web is to be spliced. Strategically positioned along"} +{"output_text": " to store the task's data.\nIn a real-time operating system (RTOS), the stack is typically implemented as a circular buffer. The circular buffer is implemented as a linked list of stack frames. Each stack frame includes a task identifier, a return address, and a stack pointer. The stack pointer points to the next stack frame in the linked list. The stack pointer is incremented each time a new stack frame is pushed onto the stack. The task identifier is used to identify the task", "input_text": " \u2062 1 \u2062 _ \u2062 2 \u2758 + -- -- -- -- S1_ \u2062 1 \u2062 _ \u2062 2 \u2062 _ \u2062 1 \u2062 + -- -- - -- S1_ \u2062 2 \nWhen a context switch occurs in subroutine S1\u20141\u20142\u20141 while task T1 is running, task T1's call... return stack will contain\ntop of stack -> return address in S1_1_2return address in S1_1return address in T1\nThis call... return stack is unique to task T1. 
When task T1 resumes after this context switch, each subroutine must successfully return to the position from where it was called in order for task T1 to continue executing properly. The kernel is responsible for managing all the stacks in the RTOS.\nArchitecture dictates the capabilities of a processor's stack. In a general-purpose stack (e.g. Motorola 68000 series microprocessors), the size of the stack is limited only by available RAM. Because the stack's contents are not constrained, a general-purpose stack can include subroutine return addresses. An RTOS allocates task stack memory based partially on the amount of memory required to store return addresses for the expected maximum call depth. As maximum call depth and/or the number of tasks in a system grows, so do its RAM requirements. RAM requirements per task for a typical RTOS range from hundreds of bytes to kilobytes or in some cases even megabytes per task. An RTOS may also allocate additional memory on a per-task basis"} +{"output_text": "IMSI\" or \"IMSI-like\" information. The MIN is a unique number assigned to each subscriber station by the network. The MIN is used to identify the subscriber station to the network and to authenticate the subscriber station to the network. The MIN is generally a 15-digit number which is assigned to the subscriber station by the network. The MIN is generally a unique number assigned to each subscriber station by the network. The MIN is used to identify the subscriber station to the network and to", "input_text": " metallic external appearance, such description being given solely by way of illustrative example and made with reference to the annexed drawings. Personal subscriber stations such as cellular telephones or other Personal Communication System (PCS) equipment are commonly used to communicate with other parties via a wireless communications system, such as a cellular telephone network. 
The ability of a personal subscriber station to access and properly operate within a wireless communications system depends in large part upon certain unique, and often secret, operational information which is programmed into each subscriber station prior to activation of wireless service, or initial use of the equipment within the wireless system. Generally, this operational information is used for such things as \"authentication\" of the subscriber station. Authentication is a procedure whereby information is exchanged between a subscriber station and a base station for purposes of enabling the base station to confirm the identity, or validity, of the subscriber station. A standardized method for authenticating a cellular subscriber station has been established by the Telecommunications Industry Association (TIA). This procedure is described in EIA/TIA Interim Standard IS-54 (IS-54) and TIA/EIA Telecommunications Systems Bulletin TSB50 (TSB50), both of which are hereby incorporated herein by reference.\nA successful outcome of the authentication process generally occurs only when it can be demonstrated that the subscriber station and the base station process identical sets of Shared Secret Data (SSD). This Shared Secret Data is generally a multi-bit pattern stored in semi-permanent memory of the subscriber station. It is, however, known to the Base Station and is calculated, or derived, based upon certain information which may include operational information unique to the subscriber station. 
One method of deriving SSD is more thoroughly disclosed in TIA IS-54.\nOperational information may include such things as a Mobile Identification Number (MIN), or a Personal Identification Number (PIN), sometimes referred to as \""} +{"output_text": " at least one hydroxyl group obtainable by condensing (i) primary or secondary, aliphatic, cycloaliphatic, aromatic, araliphatic or heterocyclic monoamines and/or polyamines, which amino or polyamine may contain OH--groups, (ii) carbonyl compounds and (iii) dialkyl phosphites, optionally followed by alkoxylation, and PA1 (3) aromatic hydroxy carboxylic acids or salts thereof and PA1 (4) optionally water and/or other", "input_text": " elasticity. Elastic-plastic intumescent compositions characterized by high dimensional stability could be widely used in the field of fire prevention in the form of semi-finished products, such as tapes, sheetings, profiles, coatings, granulates or fillings.\nThe use of melamine in the production of flexible foams using substantially linear polyols, and preferably polyether polyols, is known (see e.g. German Offenlegungsschrift No. 2,815,554). Although foams of this type are flame-resistant and do not burn completely on exposure to a flame, they do not have the character of intumescent compositions. In other words, they do not undergo any increase in volume on exposure to a flame, forming a fire-repellent foam.\nAccording to an earlier proposal, flame-resistant sealing compounds free from phosphorus and halogen may be produced using branched polyesters containing hydroxyl groups. 
Sealing compounds of this type are not intumescent, i.e., they do not foam on exposure to a flame.\nIn yet another proposal optionally foamed intumescent compositions are obtained by reacting:\n(1) polyisocyanates with PA1 (2) phosphorus-containing condensation products having at least one hydroxyl group obtainable by condensing (i) primary or secondary, aliphatic, cycloaliphatic, aromatic, araliphatic or heterocyclic monoamines and/or polyamines, which amino or polyamine may contain OH--groups, (ii) carbonyl compounds and (iii) dialkyl phosphites, optionally followed by alkoxylation, and PA1 (3) aromatic hydroxy carboxylic acids or salts thereof and PA1 (4) optionally water and/or other organic compounds containing isocyanate-reactive hydrogen atoms. PA1 (1) polyisocyanates with PA1 (2) phosphorus-containing condensation products having"} +{"output_text": " very attractive fuel because it is clean and can be used to efficiently produce electricity in a fuel cell. A hydrogen fuel cell is an electro-chemical device that includes an anode and a cathode with an electrolyte therebetween. The anode receives hydrogen gas and the cathode receives oxygen or air. The hydrogen gas is dissociated in the anode to generate free hydrogen protons and electrons. The hydrogen protons pass through the electrolyte to the cathode. 
The hydrogen protons react with the oxygen and the electrons in the cathode to generate water", "input_text": " of the existing two must be actuated, saving water if finding the correct push button or the suitable part thereof.\nTherefore, it depends on the user's will to produce the saving, and the second pulsation being carried out or not or correctly actuating the conservation mechanism is at the expense of carelessness, negligence, forgetfulness, unknowing or comfort.\nThere are even double discharge mechanisms with only one push button, in those which a first pulsation produces a partial discharge regardless of the torrent or pressure with which the water evacuates and therefore a forced and quantifiable water saving, but it is necessary to carry out a second pulsation that must also be maintained or continued for a period of time in order to produce the complete removal from the deposit.\nThe present invention overcomes the previously mentioned drawbacks in a simple manner, by means of only one water outlet valve, only one push button and one float. With a structurally heavy-duty device adaptable to any type of toilet tank, a measured and fixed volume of water is discharged by the action of pushing or pulling only once, while the complete discharge occurs when the push button or pull rod is actuated a second time in a prolonged manner. The system uses the coupling of simple mechanical devices physically separating on one side the opening and closing of the water outlet valve towards the toilet, and on the other, the filling of water in the system. 1. 
Field of the Invention\nThis invention relates generally to a system and method for monitoring the performance of fuel cells in a fuel cell stack and, more particularly, to a system and method for monitoring the performance of fuel cells in a fuel cell stack that includes a sensor for detecting an undesirable condition of the fuel cells and a tone generator that generates an AC tone in response to the detected condition that can be detected by suitable circuitry.\n2. Discussion of the Related Art\nHydrogen is a"} +{"output_text": " video recording device to an officer's uniform. For example, the video recording device may be easily dislodged or removed from the officer's uniform. The officer may be required to remove the video recording device from his or her uniform and then reattach it to the uniform. This may be inconvenient and time consuming. Further, the video recording device may be easily lost or misplaced.\nThere are also disadvantages to attaching a microphone/speaker device to an officer's uniform. For example, the microphone", "input_text": "speaker device in a position to be readily grasped and used by the officer. Alternatively, the microphone/speaker device may be secured by a strap that can be attached to the epaulette or shoulder yoke strap of the uniform. One known device that allows for a combination microphone/speaker device to be attached to a uniform is referred to by the trademark \u201cWalkieclip.\u201d With such devices, a mounting strap is secured to a button on the epaulette or the epaulette itself so as to hang down the front of the uniform shirt. 
The mounting strap receives or accepts a clip or like attachment device on the back-side of the microphone/speaker device (or pin on the back of a holder for the device) so as to position the microphone/speaker at or near the breast pocket of the officer's uniform shirt.\nSimilarly, it is known to provide a clip or pin that can be used to attach a video recording device to an officer's clothing. For example, a spring loaded clip attachment is known. Such an attachment device receives the video recording device on one side, and includes a spring-loaded clip on the other side. The clip (or pin) allows the officer to attach the camera directly to the front of his or her uniform shirt. Alternatively, it is known to provide a lanyard or like device that can be placed about the officer's neck to hold and support the video recording device. In that and the clip or pin methods of attachment, the video recording device's position may be disrupted, rendering it possibly useless or ineffective. If on a lanyard, the device's position may be altered simply by running. To be useful, the camera or video recording device would preferably be directed to the officer's front in a direction that would allow the camera to record or otherwise capture essentially what the officer is seeing.\nThere are disadvantages to attaching a"} +{"output_text": " the potassium chloride grains with a solution.\nThe following steps were carried out for enhancing or optimizing the procedure.\nThe potassium chloride grains are treated with a solution of sodium chloride, potassium chloride and magnesium chloride. The magnesium chloride is added in order to increase the solubility of the potassium chloride in the solution. The sodium chloride is added in order to increase the solubility of the potassium chloride in the solution. The potassium chloride is added in order to increase the solubility of the potassium chloride in the solution. 
The", "input_text": " this process, the product obtained remains finely granular, overall.\nIn U.S. Pat. No. 681,407, recrystallization is carried out by heating the suspension under pressure. The subsequent relief and introduction of new components such as, for example, magnesium ions, lead to salting out.\nBased on the current state of knowledge and technology for the purification of potassium chloride with potassium chloride solutions, the following steps were carried out for enhancing or optimizing the procedure.\nAs opposed to German PS 3,129,042, where it is assumed that \"the inorganic salt impurities are generally evenly distributed over the potassium chloride particles,\" one has to expect in the purification of potassium chloride crystals a non-uniform distribution of the impurities, depending on the grain size. The coarser crystals are contaminated more by inclusions of mother liquid. On the other hand, with the smaller grains, which have a larger specific surface, the adhering amount of secondary components present in the solution increases. From this follows that the grains of different grain sizes have to be treated differently in order to achieve a uniform reduction of the impurities.\nThe extraction of the foreign substances by treating the potassium chloride grains with a solution is a heterogeneous process. The lower chemical potential of the impurities in the solution is the driving force permitting the emigration of the foreign substances until an equilibrium is adjusted. This is a typical diffusion process in which the rate decreases as the process progresses because profiles of concentration develop from the interior of the grains. 
In order to achieve a uniform elimination of the impurities, coarser grains have to be extracted longer than the finer grains under the same conditions.\nMicroscopic and other investigations have shown that in the crystallization of potassium chloride from aqueous solutions, no monocrystals are formed, but rather aggregates and agglomerates whose original particles can have different dimensions. Therefore, a specific behavior has to be expected in the treatment of"} +{"output_text": " Raman spectroscopy. Need also exists for improved technologies for conducting Raman analysis of materials over optical fibers and for improved optical fibers and optical waveguides and associated coupling optics. Need is apparent for a system that can transmit source light from a source towards a sample and transmit sample light from the sample towards a detector while avoiding or mitigating interference from lightmatter interactions occurring during transmission. Need also exists for improved technologies for conducting Raman analysis of materials remotely via Raman analysis, including in vivo, in situ, in vitro,", "input_text": " can impose problematic background, artifacts, and/or interference on a spectrum.\nAnother issue facing many conventional technologies concerns coupling light between the distal end of one or more optical fibers and a sample when the media between that distal end and the sample scatters light, absorbs light, or has challenging light transmission characteristics. In this situation, such challenging media may produce a signal that interferes with the signal of interest from the sample. As another potential problem, the challenging media may diffuse or attenuate the light traveling towards or away from the sample. 
The challenging media may smear spatial resolution, distort acquired images, or otherwise disturb, confuse, complicate, or confound an analysis.\nIn view of the foregoing discussion of representative deficiencies in the art, a need exists for improved technologies for analyzing materials over optical fibers and for improved optical fibers and optical waveguides and associated coupling optics. Need is apparent for a system that can transmit source light from a source towards a sample and transmit sample light from the sample towards a detector while avoiding or mitigating interference from lightmatter interactions occurring during transmission. Further need exists for improved technologies for analyzing materials remotely via Raman analysis, including in vivo, in situ, in vitro, and/or ex vivo Raman spectroscopy. Additional need exists for improved technologies for conducting optical coherence tomography (\u201cOCT\u201d), surface enhanced Raman spectroscopy (\u201cSERS\u201d), near infrared (\u201cNIR\u201d) analysis, ultraviolet (\u201cUV\u201d) or visible (\u201cVIS\u201d) spectroscopy, UV resonant Raman spectroscopy (\u201cUVRRS\u201d), imaging, surface plasmon resonance (\u201cSPR\u201d), coherent anti-Stokes Raman scattering (\u201cCARS\u201d), anti-Stokes Raman, Fourier transform Raman (\u201cFT-Raman\u201d), elastic scattering, laser Doppler shift, hyperspectral imaging, surface enhanced resonance Raman spectroscopy (\u201cSERRS\u201d), stimulated Raman, spontaneous Raman, spatially offset Raman spectroscopy (\u201cSORS\u201d), hyper Raman, or some other appropriate analytical technique or instrumentation that utilizes"} +{"output_text": ". The player is awarded a bonus payout if the indicia presented at the pay line matches the pre-selected bonus outcome.\nU.S. Pat. No. 6,059,658 describes a device and method for playing a primary and a secondary bonus game. 
The device includes a primary game device and a secondary game device having a display having five concentrically arranged wheels each having an indicia of an Ace, King, Queen, Jack, Ten and a wild symbol. In response to", "input_text": " continue the risk of the wager, surrender and forfeit half of the wager, double the wager or triple the wager when the two face up cards are a pair. When the player does not choose to surrender, the player is dealt two additional cards. The player designates one of his cards as a Joker whereby the player has a Poker hand comprised of four cards and a Joker. The dealer is dealt three additional cards. The dealer designates one of his cards as a Joker whereby the dealer has a Poker hand comprised of four cards and a Joker. A payout is made to the player when the player's hand has a rank that is at least as high as the rank of the dealer's hand. The player may participate in ajackpot by contributing money to ajackpot pool prior to cards being dealt. A payout from the pool is based upon the rank of the player's hand.\nU.K. Patent Application GB 2 222 712 A published Mar. 14, 1990 sets forth a slot machine main game interconnected with a slot machine secondary game. The player has the option of pushing button 18 which debits his credit meter by the appropriate amount to play the secondary game such as another slot game. Hence, the player must gamble an amount in order to play the bonus game.\nU.S. Pat. No. 6,059,658 describes a device and method for playing a primary and a secondary bonus game. The device includes a primary game device and a secondary game device having a display having five concentrically arranged wheels each having an indicia of an Ace, King, Queen, Jack, Ten and a wild symbol. 
In response to receiving a pre-selected bonus outcome during play of the primary game device, the secondary game device is actuated to rotate the wheels and randomly present an indicia from each wheel at a pay line"} +{"output_text": " the table surface. Accordingly, the table cloth is not effective for the prevention of dirt on the dress or carpet.\nThe table cloth of Utility Model Registration No. 3067792 discloses a table cloth which is used both as a bib and table cloth and has attached a string on one side of a vinyl cloth covered on a table surface. The cover is not a novel table cover because it is merely a large size bib and further does not have a constant space or surplus between the vinyl cloth for the", "input_text": "\nAn exclusive apron or bib providing a large pocket is objectionable to the infant or the deformed old person but could attain a certain effect for the prevention of dirt on the dress or carpet however, the spilled food thereon could not be received into the pocket because the infant does not take the meal quietly, and the pocket is smashed because it is attached at the front portion and moreover the meal is taken at a position such as to overlap the table. Accordingly, such an apron or bib for the meal is not effective for them. Contrary, the deformed old person takes the meal calmly, however, most of them dislike using such a private apron or bib because it is unattractive and gives him a sense of incongruity with the family during use.\nAdditionally, the table cover disclosed in Utility Model Registration No. 3019803 comprises a table cloth covered on a table surface and the table cloth pieces formed respectively on four sides of the former hem for also receiving crumbs and spilled liquids. However, they can merely receive the spilled crumbs and liquids on the table surface and those crumbs and liquids have to be scraped down into the hems after the latter is closed by tape fasteners which are released every time. 
Accordingly, the table cloth is inconvenient to use and has other shortages because the prevention of the dirt on a dress or carpet is impossible due to the hem portions being attached shortly around the dining table. Disposal of the table cloth is also very much troublesome.\nThe table cloth of Utility Model Registration No. 3067791 discloses a rectangular table cloth which is used both as a bib and table cloth and has attached a string on one side of a vinyl cloth covered on a table surface. The cover is not a novel table cover because it is merely a large size bib and further does not have a constant space or surplus between the vinyl cloth for the prevention of dirt and a portion of"} +{"output_text": " the wafer, the wafer can be severed into individual chips.\nThe aforementioned U.S. Pat. No. 5,518,964 and the corresponding PCT International Publication WO 96/02068, the disclosure of which is also hereby incorporated by reference herein, disclose processes in which circuit elements such as microelectronic connection components are fabricated in the form of a wafer-size sheet. In certain processes disclosed in the '964 patent, a sheet of a starting material such as a flexible dielectric sheet", "input_text": " design of the finished product.\nThe aforementioned U.S. Pat. No. 5,518,964 and the corresponding PCT International Publication WO 96/02068, the disclosure of which is also hereby incorporated by reference herein, disclose processes in which circuit elements such as microelectronic connection components are fabricated in the form of a wafer-size sheet. In certain processes disclosed in the '964 patent, a sheet of a starting material such as a flexible dielectric sheet with metallic layers thereon is stretched and bonded to a rigid frame having an opening or aperture therein so that the sheet is held taut by the rigid frame and maintained under tension by the frame. The frame may be in the form of a ring. 
The ring may be formed from a material such as molybdenum, which has a coefficient of thermal expansion close to that of a silicon semiconductor wafer, and lower than the coefficient of expansion of the sheet. The sheet may be stretched and attached as by bonding to the ring at an elevated temperature, so that the sheet remains in tension during processing at lower temperatures. While the sheet is held in the ring, it is accessible from both sides. The sheet is treated using various circuit-fabrication techniques such as etching and plating using photographically patterned resists. Because the sheet is maintained under tension throughout the process, it remains dimensionally stable. Because the sheet is accessible from both sides, fabrication of the sheet, and mounting of the sheet to the wafer can be performed readily. The features formed on the sheet are precisely positioned relative to one another over the entire extent of the sheet.\nAfter processing, the entire sheet, with the rings still attached, can be aligned with a large assemblage of semiconductor chips such as a unitary semiconductor wafer. Leads formed during the fabrication process can be connected to all of the chips on the wafer. After connection, and after other processes such as deformation of leads on"} +{"output_text": " of a MEMS device is a problem that has not been solved yet. The present invention provides a solution to this problem.", "input_text": " since the step of formation of the NEG deposit takes place almost at the end of the process, the getter is deposited externally to the cavity. Small openings are provided in the walls of the support to allow gettering through the wall openings. The amount of getter available through the holes is extremely limited, and the getter life is reduced by its placement outside of the cavity.\nU.S. Pat. No. 5,701,008 discloses a microbolometer manufactured starting from two supports and containing a getter material. 
As far as the description of the manufacturing process is concerned, however, this document refers to a previous U.S. Pat. No. 5,433,639, which relates to the manufacturing process of a sensor of infrared radiation of traditional type (not a MEMS) wherein the different components are manufactured in parallel and assembled at the end. As such, the process of U.S. Pat. No. 5,433,639 is not directly applicable to U.S. Pat. No. 5,701,008, at least with respect to the integration of the getter within a cavity, and therefore this last document is of limited value in addressing the above-identified problems.\nU.S. Pat. No. 6,590,850 mentions the general use of a getter in a MEMS and discloses the location thereof, but it does not disclose the manufacturing process of the devices and consequently does not mention how to introduce the getter therein. U.S. Pat. No. 5,952,572 is even more indefinite, mentioning only the use of a NEG, a combination between titanium and an alloy Zr\u2014V\u2014Fe, without disclosing either the location of the getter in the cavity or, even less, the step of introducing the NEG in the cavity.\nThe efficient integration of getter material into a chamber"} +{"output_text": " invention, U.S. Pat. No. 4,234,011 discloses a plastic valve which is designed to operate at higher pressure ratings than those of the present invention. The valve of the '011 patent includes a valve body having a valve seat and a valve plug which is movable between a closed position and an open position. The valve plug is biased toward the closed position by a spring which is disposed within the valve body. The valve plug is held in the closed position by a valve seat which", "input_text": ". Field of the Invention\nThis invention relates to a reinforced plastic valve and, more specifically, to a polyethylene plastic valve which has anti-creep properties and is capable of operating at higher pressure ratings for an extended period of time.\n2. 
Description of the Prior Art\nPlastic valves, such as those disclosed in U.S. Pat. Nos. 4,014,513; 4,171,711; 4,234,011; and 4,488,741, have recently been satisfactorily and successfully employed for the flow control of numerous types of fluids in various piping systems and in a wide range of environmental conditions. However, because of the nature of plastic, there have heretofore been some limitations on the amount of fluid pressure which should be allowed in systems which employ plastic valves. For example, it has been found that, when various plastic valves have been utilized in systems which have a relatively high operating pressure, after an extended period of time, the valve plug and/or valve body can experience \"creep\" which alters the design dimensions of the valve and/or plug to decrease its overall efficiency and reliability.\n\"Creep\" can be defined as progressive strain without increased stress. If one is free to select alternative materials of construction, it is possible to eliminate any real concern for \"creep\". However, there are instances where the plastic body material must be identical to that of the piping system. For example, if the body is to be fused to the pipes in the system, the same material is required for a proper union. The piping could display high \"creep\" characteristics and still be reliable while the same \"creep\" in the body could alter its dimensions and reduce the reliability of sealing around the plug which prevents leakage and sealing at the valve seat which controls flow through the valve.\nAlthough not specifically related to the type of valve of the present"} +{"output_text": " FIG. 7B is described. The laser light is divided in a direction perpendicular to the first direction by a cylindrical lens group (hereinafter referred to as a cylindrical lens array) 107, thereby determining a length of linear laser light in a direction perpendicular to the first direction. 
The direction is called a second direction in this specification. It is assumed that, when a mirror is inserted in a course of an optical system, the second direction is changed in accordance with a direction of light bent by the mirror.", "input_text": " more (preferably, 100 to 10000). Note that the linear shape is used to obtain an energy density required for sufficiently annealing an object to be irradiated. Thus, if sufficient annealing is conducted for the object to be irradiated, it may be a rectangular shape or a sheet shape. Under the present conditions, an excimer laser of 15 J/pulse is on the market. In the future, there is also a possibility that annealing with sheet shaped laser light is conducted.\nFIGS. 7A and 7B show an example of a configuration of an optical system for forming laser light in a linear shape on a surface to be irradiated. This configuration is extremely general. All optical systems described above are based on the configuration shown in FIGS. 7A and 7B. According to the configuration, a cross sectional shape of laser light is converted into a linear shape, and simultaneously an energy density distribution of laser light on the surface to be irradiated is homogenized. In general, an optical system for homogenizing the energy density distribution of laser light is called a beam homogenizer.\nLaser light emitted from a laser 101 is divided in a direction perpendicular to a traveling direction thereof by a cylindrical lens group (hereinafter referred to as a cylindrical lens array) 103, thereby determining a length of linear laser light in a longitudinal direction. The direction is called a first direction in this specification. It is assumed that, when a mirror is inserted in a course of an optical system, the first direction is changed in accordance with a direction of light bent by the mirror. In the configuration shown in the top view of FIG. 7A, the cylindrical lens array is divided into seven parts. 
Then, the laser lights are synthesized on a surface to be irradiated 109 by a cylindrical lens 105, thereby homogenizing an energy density distribution of the linear laser light in the longitudinal direction.\nNext, the configuration shown in the cross sectional view of"} +{"output_text": " a strip chart.\nThe electrocardiograph is a very useful tool in diagnosing the condition of a patient's heart. However, the electrocardiograph is not without its drawbacks. For example, the electrocardiograph is not a portable device. It is not designed to be carried by a patient, but rather is designed to be placed upon a patient. The electrocardiograph is also not designed to be used in a variety of different environments. For example, the electrocardiograph is not designed to", "input_text": " electronic device and in designing the stylus as a mediator of interaction with portable electronic devices. This can significantly extend functionalities of the stylus-based interaction that has been realized in the present invention. 1. Field of Invention\nThis invention relates to a electrocardiogram (ECG) diagnostic device, and more specifically, to a disposable ECG diagnostic chest pad having pre-positioned lead electrodes and internal wiring which may be placed quickly as a single unit upon a patient and may be separated into two sections by way of perforated sections in the underling pad material, thus allowing greater flexibility in monitoring and diagnosing a patient's electrocardiogram waves.\n2. Description of Prior Art\nIt has been long known in the medical community that the current condition, and possibly future state, of a subject patient's heart muscle can be ascertained by measuring the cardiac electrical activity. The electrical system of the heart not only initiates and controls the rate of heartbeat, but also coordinate to transmission in the most efficient mechanical manner. 
When such electric signals are irregular, it is a sign of cardiac problems, particularly cardiac arrest, better known as a \u201cheart attack.\u201d\nLike all electrical signals, the electrical signals generated by the heart can be expressed as a wave or a series of waves having a frequency and amplitude. Again, like all electrical signals, these waves can be detected and measured\u2014in this case by an electrocardiograph. Electrodes, generally pads containing conductive material, such as silver chloride and an adhesive are attached to the trunk and limbs of a patient's body. The electrodes are in turn attached to \u201cleads\u201d or cables which are connected to the electrocardiograph. Generally, in modern medical practice a ten-lead electrocardiograph is used to produce twelve lead measurements through the use of bipolar electrodes. The electrocardiograph receives the signals from the leads, processes them and outputs the resulting waveform patterns, usually on"} +{"output_text": "ATA and QDATA is selected by the phase detection logic 13, and then provided to the loop filter 14.\nThe loop filter 14 filters the selected n-bit parallel data IDATA and QDATA, and provides the filtered n-bit parallel data IDATA and QDATA to the phase interpolation controller 15. The phase interpolation controller 15 generates four recovery clock signals that respectively have a frequency of f/2 Hz and a phase difference of about 90\u00b0 between each other, based on the filtered n-", "input_text": " high-speed logic circuit and by increasing pipeline steps; however, the chip size or power consumption may be considerably increased.\nFIG. 1 is a block diagram illustrating a conventional CDR circuit. The conventional CDR circuit of FIG. 1 converts high-speed serial data into low-speed parallel data, and then detects a phase difference of the converted parallel data.\nReferring to FIG. 
1, the CDR circuit includes a sampler 11, a deserializer (serial-parallel converter) 12, a phase detection logic 13, a loop filter 14, a phase interpolation controller 15, a phase interpolator 16, a frequency divider 17, and a phase-locked loop 18.\nThe phase-locked loop 18 generates four reference clock signals that respectively have a frequency of f/2 Hz and a phase difference of about 90\u00b0 between each other. The phase interpolator 16 receives the reference clock signals and adjusts the phases, to generate four recovery clock signals that respectively have a frequency of f/2 Hz and a phase difference of about 90\u00b0 between each other. The phase interpolator 16 provides the recovery clock signals to the sampler 11.\nThe frequency divider 17 lowers an inputted frequency by 1/n, and outputs the lowered frequency. That is, the frequency divider 17 transforms the inputted f/2 Hz clock signal into an f/2n Hz clock signal, and then provides the f/2n Hz clock signal as an operating clock of the deserializer 12, the phase detection logic 13, the loop filter 14 and the phase interpolation controller 15.\nThe sampler 11 samples serial data INPUT having f bps, and provides a sampled signal to the deserializer 12. The deserializer 12 transforms the sampled signal into two n-bit parallel data IDATA and QDATA. At least one of the transformed n-bit parallel data ID"} +{"output_text": " by heating. The frame member is then removed from the groove and the glass panes are glazed by inserting them into the frame member and filling the frame member with a heat-hardenable sealing mass of the polysulfide type. The frame member is then removed from the groove and the glass panes are glazed by inserting them into the frame member and filling the frame member with a heat-hardenable sealing mass of the polysulfide type. 
The frame member is then removed from the groove", "input_text": " cases where exact depth-control and uniformity of the grafted region is important, such as for example in the surface modification of contact lenses, such uncontrollable grafting reactions are not acceptable. On the other hand, if to reduce inhomogeneities grafting is carried out for a short time only, the grafted surface regions are too thin and in many applications the desired effect soon wears off. Exact control over reaction conditions is therefore very important.\nIt has now also been discovered, that polysiloxane-polyurethane rubbers are especially well suited to make soft contact lenses not only with excellent oxygen permeability, but excellent wettability and hydrogel-like softness as well, when they are prepared in contact lens molds which have previously been coated with a reactive hydrophilic polymer, which is transfer-grafted during cure.\nIt has been further discovered, that polysiloxane-polyurethane rubbers can be made in form of an interpenetrating polymer network (IPN) with a free-radical polymerized vinyl polymer; these IPN's are often clear and besides being highly oxygen permeable, allow the physical properties of the polysiloxane-polyurethane rubber to be varied over a wide range; they include water swellable compositions and compositions bearing polar groups which are otherwise difficult to incorporate into a polyurethane. The field of this invention lies in the glazing of a window frame and a glass pane, in particular, a double glass pane, which pane is inserted into a window frame groove.\nIn the known glazing process which is described in U.S. Pat. No. 3,667,179, double glass panes are glazed by inserting them into the groove of a horizontally arranged frame member and the groove or slot is filled with a heat-hardenable sealing mass of the polysulfide type, after which such mass is hardened"} +{"output_text": ", Tex., and the following U.S. 
Patents, entire copies of which are incorporated herein by reference:\nU.S. Pat. No. 6,431,282; U.S. Pat. No. 6,431,281; U.S. Pat. No. 6,431,280; U.S. Pat. No. 6,431,279; U.S. Pat. No. 6,431,278; U.S. Pat.", "input_text": " regional development. However, centralized surface operations with fixed facilities require a paradigm shift in development drilling operations. The well drilling and maintenance equipment would not normally be mobile (except offshore on vessels) and it would normally spend its entire working life from one location.\nSeveral references are cited below related to the topics of expandable casing, methods to expand tubulars and casings, fabricating composite umbilicals, and well management systems.\nRelevant references to expandable casing includes U.S. Pat. No. 5,667,011, entitled \u201cMethod of Creating a Casing in a Borehole\u201d, which issued on Sep. 16, 1997, that is assigned to Shell Oil Company of Houston, Tex., and the following U.S. Patents, entire copies of which are incorporated herein by reference:\nU.S. Pat. No. 5,366,012; U.S. Pat. No. 5,348,095; U.S. Pat. No. 5,240,074; U.S. Pat. No. 4,716,965; U.S. Pat. No. 4,501,327; U.S. Pat. No. 4,495,997; U.S. Pat. No. 3,958,637; U.S. Pat. No. 3,203,451; U.S. Pat. No. 3,172,618; U.S. Pat. No. 3,052,298; U.S. Pat. No. 2,447,629; U.S. Pat. No. 2,207,478\nRelevant references to expandable casing also includes U.S. Pat. No. 6,431,282, entitled \u201cMethod for Annular Sealing\u201d, which issued on Aug. 13, 2002, that is assigned to Shell Oil Company of Houston"} +{"output_text": " efficient transfer of power to the load. Third, balanced power systems have zero neutral voltage. This ensures that the neutral point of the system is at the same potential as the other phases. Fourth, balanced power systems have zero neutral current. 
This ensures that the neutral point of the system is at the same potential as the other phases.\nThe three-phase power system is a balanced system because the three phases are 120\u00b0 out of phase with each other. The three phases are 120\u00b0 out of phase", "input_text": " the observers. The inner workings of the system must be off limits to people unless properly trained to handle such equipment.\nThere is a need for a three-dimensional back-projection display system that can overcome, inter alia, the limitations of the prior art by eliminating focal difficulties, cumbersome mirror assemblies, custom optics requirements, costly and impractical light source cooling, and distracting shadows cast onto the display surface. Three-phase electric power is a common method of electric power transmission and is implemented by three conductors each carrying voltage waveforms 2\u03c0/3 radians (120\u00b0 or \u2153 of a cycle) offset in time. Public power facilities that deliver electric power to domestic, industrial, and commercial buildings in most nations including, e.g. the United States and much of Europe, generate three-phase power. Although voltage produced by these power facilities typically varies throughout the world, e.g. 460V 60 Hz in the United States versus 230V 50 Hz in much of Europe, the 120\u00b0 phase separation of a three-phase system is always approximately constant, as it is a defining characteristic of three-phase power.\nThree-phase electric power is appropriate for various forms of electric equipment including e.g., motor drives, appliances, boilers, space heaters, electric arc furnaces, rail cars, air conditioning units etc. Typically, these applications are designed around a balanced three-phase power input. Balanced power is a result of all three phases having a substantially identical voltage and a 120\u00b0 shift in phase with respect to each other. Balanced three-phase power has several distinguishing characteristics. 
First, balanced power provides constant, non-time varying electric power transfer. Constant power transfer is a desirable condition, as large motor drives and generators will run much more smoothly on constant power than on varying input power. Second, balanced power systems have zero neutral current. This ensures a more"} +{"output_text": " personnel.\nU.S. Pat. No. 6,975,913 teaches a method and apparatus for locating a mobile device using a cellular network. The method includes the steps of receiving a request for a location of a mobile device, receiving a location of the mobile device from a cellular network, and determining a location of the mobile device using the cellular network. The method also includes the steps of determining a location of the mobile device using a GPS system, and determining a location of the", "input_text": " it is commonly known) is comprised of a number of satellites circling the Earth that radiate timing signals that are controlled by a network of ground stations. By measuring the arrival of these signals at a receiver, it is possible to determine the location of the receiver to very high precision. While Mohan (U.S. Pat. No. 6,121,922) teaches the combined use of GPS and a mobile transmitter in a compact form-factor, Grimm extended this concept to include a RF beacon that allowed for the local isolation of stolen device with far greater precision than can be achieved with a cellular network location or GPS position fix. U.S. Pat. Nos. 6,665,613, 6,480,147, 6,271,757 teach different variations of defining regions within a tracking device which upon the device getting a GPS location (or any kind of position determination) outside of the region certain actions will occur including, but not limited to, reporting to a network, sending alerts, or sounding an alarm. 
All these patents teach that there is some alarm mechanism in which some security organization or law enforcement agency can be alerted to a theft of the asset under protection by the tracking device. However, none of these references teaches how such a device came to be installed at a particular fixed location and how a device is associated with a law enforcement agency in a given jurisdiction. It is unclear if these devices were pre-programmed at the factory to be matched with their end location. Since security or law-enforcement agencies need to respond to a particular location (for example, a specific bank branch involved in a robbery), it is clear that some relationship existing between the asset tracking and the location must be established but no information is given on this method or system. These patents also do not teach how these security devices could be serviced and tested without sending non-robbery related alerts to security"} +{"output_text": " N by driving RxClav high at time t2. The ATM device checks RxClav at time t3. If RxClav is high, the ATM device can select the PHY device having address N and receive data from this PHY device. If RxClav is low, the ATM device must wait for the next clock cycle to poll the next PHY device.\nThe UTOPIA interface is a synchronous interface. The PHY layer device must respond to the ATM layer device", "input_text": "xAddr, and TxSOC signals.\nSimilarly, RxClk is the receive clock signal that is used to clock control signals and data in the receive direction (from the PHY device to the ATM device). RxData[15:0] is a 16-bit UTOPIA Receive bus. The assertion of RxEnb* is coincident with the start of the cell transfer. RxSOC is used to indicate the start of cell position. RxClav is used to indicate that the PHY layer device is ready to Receive a cell from the ATM layer device. 
RxAddr[4:0] is the UTOPIA address of the PHY device and is used by the ATM device to poll and select the appropriate PHY device in the receive direction.\nAt the UTOPIA receive interface, the ATM layer device polls the RxClav status of a PHY layer device by placing a specified address on RxAddr bus for one clock cycle. The PHY layer device which is associated with the address on the RxAddr bus drives RxClav high (or low) during the next clock cycle during which the ATM device places a null address (1F) on the RxAddr bus. The ATM layer device checks RxClav at a certain time after it issues RxAddr. Based on polled RxClav information, the ATM layer device can select a PHY device and receive data from this PHY device by driving RxEnb* and RxAddr signals.\nCertain timing requirements must be met for the Multiplexed Status Polling operation of the UTOPIA interface so that the ATM layer device can correctly detect Clav (Cell buffer available) information. FIG. 2 depicts the timing requirement for UTOPIA level 2 address polling. An ATM device starts driving UTOPIA address N at time t1. The PHY device having address N responds to address"} +{"output_text": "Still yet another object of the present invention is to prevent the overheating of a projection weld nut by preventing the application of subsequent welding power to the projection weld nut.\nStill yet another object of the present invention is to prevent the overheating of a projection weld nut by preventing the application of subsequent welding power to the projection weld nut.\nStill yet another object of the present invention is to prevent the overheating of a projection weld nut by preventing the application of subsequent welding power", "input_text": " obstacle in resistance welding occurs when an electrode becomes fused to a welding surface after completion of a weld. This condition is known as a \u201cstuck gun condition\u201d. 
If the welding system does not detect the stuck gun before attempting to move the electrodes from a closed welding position to an opened position, extensive damage to the electrodes, weld gun, a work cell, and even human weld operators may occur.\nAnother consideration in resistance welding is to ensure that the electrodes apply welding current to the projection weld nut once and only once. The strength of the weld between a projection weld nut presently welded to a workpiece is substantially weakened with the application of subsequent electric welding power. The subsequent welding power causes the projections to overheat, and thus become brittle.\nIt is an object of the present invention to determine whether the desired surface of a projection weld nut is in a proper fit up position with a workpiece prior to applying weld power.\nAnother object of the present invention is to ensure that a projection weld nut forms a strong weld with a workpiece, to monitor and to control the formation of the weld, and to analyze the quality of the weld.\nA further object of the present invention is to determine whether a projection weld nut is properly aligned with a workpiece prior to applying weld power.\nStill another object of the present invention is to check for a stuck gun condition before moving the electrodes from a closed welding position to an opened position.\nStill yet a further object of the present invention is to prevent the shorting of the welding electrode by avoiding the application of electrical power to an electrode not engaged with a projection weld nut.\nStill yet another object of the present invention is to prevent the subsequent application of electric power to a projection weld nut presently welded to a workpiece.\n"} +{"output_text": " of the segments, and the sealing is effected by means of a gasket.\nU.S. Pat. No. 
4,986,941 to Krill et al describes a segmented radiant surface burner formed of large cylindrical segments that are bolted together in axial alignment. This arrangement of large burner segments was conceived to fit the peculiar shape of combustion chambers of fire tube boilers. The serial alignment involves sealing between the abutted ends of the segments, and the sealing is effected by", "input_text": " heating modulation.\nAssignee's U.S. Pat. No. 6,199,364 to Kendall et al discloses compact surface-stabilized gas burners that are well suited for use with gas turbines. Surface-stabilized gas burners are therein defined as having burner faces with dual porosities so that surface combustion from the lower porosity areas serves to keep the blue flames from the higher porosity areas attached to the burner face when fired at rates of at least about 500,000 BTU/hr/sf (British Thermal Units per hour per square foot) of burner face.\nA principal object of this invention is to provide compact surface-stabilized gas burners featuring a broad range of heat delivery.\nAnother important object is to provide such surface-stabilized gas burners with internal walls that divide each burner into two or more segments that can be individually and independently fired to vary the thermal output, while maintaining the adiabatic flame temperature of the fired segments in a range yielding low emissions.\nStill another object is to provide segmented surface-stabilized gas burners that are simple in construction as well as operation.\nThese and other features and advantages of the invention will be apparent from the description which follows.\nBasically, the segmented surface-stabilized gas burner of this invention which has a combustion surface formed of metal and/or ceramic fibers may have a unitary body with internal partitions to provide independent burner segments, or it may have two or more burner modules that are compactly fitted together.\nU.S. Pat. No. 
4,543,940 to Krill et al describes a segmented radiant surface burner formed of large cylindrical segments that are bolted together in axial alignment. This arrangement of large burner segments was conceived to fit the peculiar shape of combustion chambers of fire tube boilers. The serial alignment involves sealing between the abutted ends"} +{"output_text": "\nThe advent of CT scanning has enabled medical diagnosticians to obtain a very high degree of accuracy in the diagnosis of a wide variety of bodily ailments. However, the high cost of the equipment involved, and the relatively high radiation dosage to which the patient is subjected, has limited the use of CT scanning to a relatively small number of medical facilities.\nIn an effort to reduce the cost of CT scanning, and to reduce the radiation dosage to which the patient is subjected, a number of alternative", "input_text": " A conductive material is located within the depressions and then preferably planarized to remove excess from the top surface of the dielectric layer and to provide a flat overall surface. A trace is patterned on the dielectric layer and the conductive material. A polyimide layer is then preferably patterned over the entire surface. The substrate is then removed by any suitable process.\nThe foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings. This invention relates generally to non-destructive testing, relates more specifically to medical diagnostic apparatus and methodology; and yet more specifically, relates to X-ray scanning apparatus and methodology of the type associated with computed tomography.\nWithin recent years much interest has been evidenced on the part of medical diagnosticians in the field now widely known as \"computed tomography\" sometimes referred to hereinafter as \"CT\". 
In a typical procedure an X-ray source and detector apparatus are positioned on opposite sides of the portion of the patient which is intended for examination. In early prior art these paired elements are made to transit across the body portion to be examined while the detectors measure the X-ray absorption at the plurality of transmission paths defined during the transit process. Periodically as well, the paired source and detector means are rotated to a different angular orientation about the body and the transit process repeated. A very high number of absorption values may be yielded by procedures of this type, and the relatively massive amounts of data thus accumulated are processed by a digital computer which correlates the absorption values, to thereby derive absorption values for a very high number of points (typically in the thousands) within the section of the body being scanned.\nThis point-by-point data can then be combined to enable reconstruction of a matrix (visual or otherwise), which constitutes an accurate depiction of the density function of the bodily section examined."} +{"output_text": ", the arrangement of the refrigerator 30 is such that the heat conductor 65 adjacent to the refrigerator 30 is disposed substantially horizontally as to reliquefy the helium gas evaporated in the helium chamber 2. Therefore, the heat conductor 65 is disposed in the vicinity of the helium chamber 2, causing the heat conductor 65 to be heated by the helium gas. As a result, the heat conductor 65 is thermally expanded, causing the fastening force between the heat conductor 65 and the indium wire 66 to be reduced", "input_text": "\" ring 71. 
The fastening force of the bolt 69 plastically deforms the indium wire 66 so that the thermal connection is established between the heat conductor 64 adjacent to the cylinder 51 for fastening the refrigerator 30 and the heat conductor 65 adjacent to the refrigerator 30.\nExcessive fastening force of the bolt 69 and the displacements of the elements due to the thermal contraction and vibrations are absorbed by the belleville spring 70 so that the breakage of the elements and the defective thermal connection can be prevented. Further, even if the three-stage regenerative refrigerator 30 is contracted after it has been fastened to the cylinder 51 for fastening the refrigerator 30 and cooled sufficiently, further tightening of the bolt 69 enables a desired fastening force to be maintained.\nFurther, the tapered surface of the heat conductor 65 adjacent to the refrigerator 30 is knurled so that the fastening force between the indium wire 66 and the knurled surface is enlarged. Therefore, when the three-stage regenerative refrigerator 30 is removed, the removal can be performed such that the indium wire 66, which has been deformed plastically, adheres to the tapered surface of the heat conductor 65 adjacent to the refrigerator 30.\nSince the conventional superconductive magnet has the arrangement that the refrigerator 9 is disposed vertically in the axial direction of the magnet as described above, the position of the magnet device cannot be lowered and, accordingly, the overall size cannot be reduced.\nSince the superconductive magnet disclosed by the applicant of the present invention has the arrangement that the three-stage regenerative refrigerator 30 is disposed substantially horizontally as to reliquefy the helium gas evaporated in the helium chamber 2, the height of the apparatus can be lowered and therefore the overall size of the apparatus can be reduced. 
Further, the distance for which the displacer reciprocates can be maintained, causing the refrigerating performance to be improved. However"} +{"output_text": " is located at the bottom of the nozzle. The nozzle top is crimped onto the barrel by a crimping tool which is inserted into the barrel and crimps the nozzle top onto the barrel. The crimping tool is inserted into the barrel and crimps the nozzle top onto the barrel by applying a downward force on the crimping tool. The crimping tool is inserted into the barrel and crimps the nozzle top onto the barrel by applying a downward force on the crimping tool. The crimping", "input_text": ", are well known in the industry. Such cylindrical cartridges have nozzles at their top end and, after opening, such cartridges are inserted into guns wherein a plunger is advanced therein by action of squeezing a trigger, causing material in the cartridge to flow out through the nozzle to the area where it is to be applied. Some cartridges have a foil seal under the nozzle against which seal the material can be positioned. Some materials, if exposed to air, will harden, so that by providing such a seal, air contact with the materials before the cartridge is opened is minimized. To open a cartridge having a foil seal, one must first snip off the tip of the plastic nozzle and then insert an object down the nozzle to puncture the foil seal located at the bottom of the nozzle to allow the passage of the filler material out of the cartridge. It is sometimes difficult to locate a narrow enough instrument to insert down the open nozzle tip to puncture the foil seal. 
Further, if one snips off the nozzle tip to leave a small diameter opening to achieve a fine application bead of material and one does not have an instrument narrow enough to pass down through the opening in the nozzle to puncture the foil seal, one can undesirably stretch the nozzle tip by using a larger object, making it difficult to apply a narrow bead of material as the now-wider opening in the nozzle tip will allow a wider-than-desired bead of material to pass out the nozzle.\nCartridges filled with a variety of filler materials are commonly sold. The tops of such cartridges including their nozzles are formed of plastic. The top is spun within the barrel to effect a heat seal with the sides of the barrel. Nozzle tops are also made in two parts wherein plastic nozzle top is press fit into a metal crimpable end cap, forming a nozzle top which can be crimped onto a paperboard barrel. The foil seal"} +{"output_text": "-13 of the Notch (Artavanis-Tsakonas et al., Science 268, 225-232, 1995).\nIn the present invention, the Notch ligand is a protein which is homologous to the Notch ligand of Drosophila, and the binding region of the Notch ligand is a repeated amino acid sequence No. 11-13 of the Notch ligand.\nThe present invention relates to a method for producing a recombinant protein, and more particularly to a method for producing a recombinant protein by using a", "input_text": " undifferentiated cells in each tissue can be applied in various ways by referring to the known reference (Katsutoshi Yoshizato, Regenerationxe2x80x94a mechanism of regeneration, 1996, Yodosha Pub1. Co.).\nNotch is a receptor type membrane protein which is involved in regulation of nerve cells differentiation found in Drosophia. Homologues of the Notch are found in various animal kinds exceeding to the invertebrate and vertebrate including nematoda (Lin-12). Xenopus laevis (Xotch), mouse (Motch) or human (TAN-1). Ligand of the Notch in Drosophila are known. 
These are Drosophila Delta (Delta) and Drosophila Serrate (Serrate). Notch ligand homologues are found in various animal kinds as similar to the Notch of receptors (Artavanis-Tsakonas et al., Science 268, 225-232, 1995).\nHuman Notch homologue, TAN-1 is found widely in the tissues in vivo (Ellisen et al., Cell 66, 649-661, 1991). Two Notch analogous molecules other than TAN-1 are reported (Artavanis-Tsakonas et al., Science 268, 225-232, 1995). Expression of TAN-1 was also observed in CD34 positive cells in blood cells by PCR (Polymerase Chain Reaction) (Milner et al., Blood 83, 2057-2062, 1994). However, in relation to humans, gene cloning of human Delta and human Serrate, which are thought to be the Notch ligand, have not been reported.\nIn Drosophila Notch, binding with the ligand was studied and investigated in detail, and it was found that the Notch can be bound to the ligand with Ca++ at the binding region, which is a repeated amino acid sequence No. 11"} +{"output_text": " the characteristics of the cells, but also relative information concerning the characteristics of the cells. The relative information is used to determine which cell is the best candidate for the MS to lock on to.\nThe MS 120 then locks on to the best cell, and the control and processing unit 130 controls the voice and control channel transceiver 150 to transmit and receive voice and control information over the selected cell. The MS 120 may also receive information from the PSTN through the MSC 140.\nThe MS 120 may", "input_text": " ##STR12## ______________________________________ This invention relates generally to a method and apparatus for detecting a received signal in a communication system. More particularly, this invention relates to a method and apparatus for detecting locations of path rays in a multi-path receiver having multiple time references.\nFIG. 
1 is a block diagram of an exemplary cellular radiotelephone system, including an exemplary base station (BS) 110 and a mobile station (MS) 120. Although denoted a \u201cmobile station\u201d, the station 120 may also be another type of remote station, e.g., a fixed cellular station. The BS includes a control and processing unit 130 which is connected to a mobile switching center (MSC) 140 which in turn is connected to a PSTN (not shown). General aspects of such cellular radiotelephone systems are known in the art. The BS 110 handles a plurality of voice channels through a voice channel transceiver 150, which is controlled by the control and processing unit 130. Also, each BS includes a control channel transceiver 160, which may be capable of handling more than one control channel. The control channel transceiver 160 is controlled by the control and processing unit 130. The control channel transceiver 160 broadcasts control information over the control channel of the BS or cell to mobiles locked to that control channel. It will be understood that the transceivers 150 and 160 can be implemented as a single device, like the voice and control transceiver 170, for use with control and traffic channels that share the same radio carrier.\nThe MS 120 receives the information broadcast on a control channel at its voice and control channel transceiver 170. Then, the processing unit 180 evaluates the received control channel information, which includes the characteristics of cells that are candidates for the MS to lock on to, and determines on which cell the MS should lock. Advantageously, the received control channel information not only includes absolute information concerning"} +{"output_text": "x80x9cseenxe2x80x9d by each interferometer is then the sum of the optical path-lengths of the two measurement paths. This approach is described in U.S. Pat. No. 5,867,276, entitled xe2x80x9cOptical Path Length Measurement System and Methodxe2x80x9d, issued to K. A. Nugent et al. 
on Feb. 9, 1999, and assigned to", "input_text": " in local density of the air in the measurement space. Such air density variations can result from a number of factors, including local temperature variations and air movement. Since the refractive index of the air through which the optical signal passes varies slightly with the density of the air, such turbulence can cause small errors in the distance measurements, as the distance measurement is a function of the wavelength of the optical signal and the refractive index of the air.\nExisting high quality single-wavelength interferometers can measure an optical path-length, for example a path-length used in lithography as a measure of stage position, with a theoretical precision on the order of 1 nm or better. However, turbulence of the air in the interferometer optical signal path typically contributes variations of 10-30 nm to the measured path-length during the typical time period in which an integrated-circuit wafer is exposed.\nSince such single-wavelength interferometers cannot distinguish between path-length changes due to this air turbulence and those due to stage motion, air turbulence has the effect of degrading the precision of these interferometers to a point where they are marginally capable of supporting 0.25 xcexcm-design-rule lithography. Hence, 0.1 xcexcm-design-rule lithography and below, which are becoming increasingly important in the industry, present significant challenges to the accuracy and precision of single-wavelength interferometers. 
As a result, under typical wafer production conditions, the overlay precision of single-wavelength interferometers is limited by air turbulence to approximately 10-30 nm, which is an unacceptably large imprecision for 0.1 xcexcm-design-rule lithography.\nOne solution which has been proposed to overcome the air turbulence problem is for two interferometers employing light beams having significantly different wavelengths (or frequencies) to share a common measurement path. The optical path-length of the measurement path xe2"} +{"output_text": "ed EAP methods to provide a more flexible policy-based access control. For example, a supplicant may be configured to use a particular EAP method, but the AAA server may allow the supplicant to use a different EAP method. In this case, the AAA server may send a TLV to the supplicant that indicates the EAP method that the supplicant is allowed to use. The supplicant may then use the indicated EAP method.\nHowever,", "input_text": " router that intercepts requests of the supplicant; the access router has the role of a client with respect to the AAA server.\nEAP supplicants and servers are typically used in a relatively simple operational configuration in which one encrypted outer EAP method protects messages communicated using one inner EAP method. For example, the outer method may be EAP-PEAP and the inside method may comprise EAP GTC, in which the user uses a cryptographic token card to supply a user credential, or the inside method may comprise Microsoft Challenge-Authentication Protocol (MS-CHAP).\nHowever, EAP sequences are becoming more widely deployed inside tunneled EAP methods, such as EAP-PEAPv2 and EAP-FAST, to provide authentication using more than one authentication factor, user authorization, validating that the supplicant has a required software configuration (\u201cposture validation\u201d), and other processes. 
These approaches introduce a greater burden on AAA servers, as these new methods require multiple cycles of challenge and response messages, and each message may require breaking into small fragments because they exceed the maximum transportable unit size of the transport medium (e.g., a WLAN) or transport protocol. Therefore, it is desirable not to burden an AAA server with a user authentication transaction unless it is relatively certain that the supplicant can perform as required in the transaction.\nFurther, new AAA server features may include more flexible policy-based access control. For example, authentication protocols no longer need to be statically pre-programmed into devices and supplicants. However, the whole authentication message sequence can fail at the last stage if the supplicant is not configured correctly. For this further reason, it is desirable not to initiate a message sequence if the supplicant cannot complete the sequence.\nType-length-value triplets (TLVs) are now used inside tunnel"} +{"output_text": " wafer or substrate along the partial cut or cuts.\nIn the case of mechanical singulation, the wafer or substrate is typically mounted on a chuck or other support structure, and the wafer or substrate is then sawed using a diamond-coated saw blade. The saw blade is typically mounted on a spindle which is rotated by a motor. The saw blade is moved across the surface of the wafer or substrate to make a cut. 
The saw blade is moved across the surface of the wafer or substrate by a", "input_text": " of a signal on a signal path comprises inputting the signal to a pull-up stage, where an output signal of the pull-up stage is operatively coupled to the signal path after a high to low transition on the signal; and inputting the signal to a pull-down stage, where an output signal of the pull-down stage is operatively coupled to the signal path after a low to high transition on the signal, and where the pull-up stage outputs an accelerated low to high transition when the signal begins to transition from low to high.\nAccording to another aspect, a method for accelerating a transition of a signal on a signal path comprises activating a pull-up stage in response to a high to low transition on the signal, detecting a beginning of a low to high transition on the signal, and accelerating the low to high transition on the signal when the beginning of the low to high transition is detected.\nOther aspects and advantages of the invention will be apparent from the following description and the appended claims. Electronic devices are typically manufactured by producing multiple copies of the same device on a substrate or workpiece. In particular, semiconductor devices are manufactured on substrates referred to as wafers, which are thin disks of materials such as silicon, gallium arsenide or sapphire or other materials which are capable of supporting the various processes that create semiconductor devices. These devices at some point in the manufacturing process need to be separated into individual devices for subsequent packaging and use. This separation into individual devices is referred to as \u201csingulation\u201d. Singulation can be performed mechanically, using diamond-coated saw blades, chemically, by masking and etching, photonically by directing laser energy at the wafer or substrate, or combinations of these methods. 
Singulation can be accomplished by cutting completely through the wafer or substrate, or by making a partial cut or cuts into one or more surfaces of the wafer or substrate and then mechanically cleaving the"} +{"output_text": " of the coolant fluid.\nThis object is achieved by an arrangement for controlling the coolant fluid in a conventional compressor, wherein the compressor has a compressor housing, a compressor wheel rotatably mounted in the compressor housing, a drive shaft connected to the compressor wheel, a drive motor for driving the drive shaft, a coolant fluid conduit for supplying the coolant fluid to the compressor wheel, a bypass conduit for bypassing the coolant fluid conduit, a bypass valve for controlling the bypass conduit,", "input_text": " processor depends on various parameters. Hence this solution is extremely elaborate to implement, both because multiple parameters must be monitored and evaluated and because an additional bypass conduit must be provided.\nThe solutions discussed above are predominantly concerned with the problem of keeping the coolant fluid in the compressor itself at a temperature such that water does not condense out and hence impairment of the coolant fluid and of the compressor is prevented. At the same time, the forms of regulation here disclosed are designed so as also to avoid raising the coolant fluid to a temperature high enough to be potentially damaging. However, the problems associated with the condensation of water while it is in the pneumatic consumer devices or in the conduits leading thereto are not addressed.\nA variant of a solution relevant to this point is known from the patent DE 36 01 816 A1. There the compressed process fluid, which has been heated to about 60xc2x0 C. above the intake temperature of the compressor, is passed through an overdimensioned after cooler to bring it down to a temperature about 10xc2x0 C. above the intake temperature. 
A considerable proportion of the water vapor present in the process fluid is thereby caused to condense out and is eliminated by a condensate trap. The compressed process fluid is subsequently sent to a heat exchanger where it is rewarmed so that ultimatelyxe2x80x94influenced to some degree by the current ambient parameters, which in this design are assumed to be unchangingxe2x80x94a process fluid is produced that is quite dry and about 60xc2x0 C. above the intake temperature, i.e. very hot.\nIt is an object of the present invention to provide an arrangement for controlling the coolant fluid in a conventional compressor which has a simple, economical and reliable construction and wherein it is possible to reduce or, where possible, avoid the condensation of water out"} +{"output_text": " MN and the CN without passing through the Home Agent (HA). In the bidirectional tunnel mode, data packets are transferred between the MN and the CN through the HA.\nIn the bidirectional tunnel mode, the MN and the CN are connected to each other through a tunnel. The tunnel is established by the HA. The tunnel is established by the HA in a manner that the MN and the CN are connected to each other through a tunnel. The tunnel is established by the HA in a manner that", "input_text": " valve closes when the ambient pressure around the valve drops below a predetermined level), a pressure-differential type (e.g. the valve automatically closes when there is an abnormal increase of pressure through the valve) or injection safety valves. (See Composite Catalogue of Oil Field Equipment & Services 1974-75 pages 3995, 4008 to 4011, and 4014). However, such valves only close in response to the predetermined condition and cannot be controlled to open or close from the surface.\nInjection safety valves do respond quickly to close the tubing string whenever a backkick occurs. 
However, present injection safety valves have no means to maintain the valve in an open position if it is desired to have a high back flow rate through the valve because they are urged towards a close position by such back flow. In addition they may have biasing means to constantly urge the valve member to a closed position.\nIt is sometimes desirable to inject fluids in a well equiped with a subsurface safety valve. In doing so, it is desirable that a check valve be present down in the tubing to protect personnel and equipment at the well. Equipment has not been available for this purpose without running additional equipment in to the well. In the mobile IPv6 data transfer technology, every Mobile Node (MN) has a fixed Home Address (HoA), which is independent of the current location by which the MN accesses the Internet and is directly used in a home link of the MN. When the MN moves outside the home link, the current location information of the MN is provided over a Care of Address (CoA) acquired from a Foreign Agent (FA).\nA Communication Node (CN) is a communication opposite end of the MN. A bidirectional tunnel mode and a route optimization mode may be used for transferring data packets between the MN and the CN.\nIn the route optimization mode, data packets are directly transferred between the"} +{"output_text": " logic 108.\nFIG. 2 illustrates a conventional scan architecture 200, which includes a scan path circuit 204, logic circuitry to be tested 208, and connection paths 212-220 to a tester 210. Tester 210 operates to: (1) output control to operate scan path circuit 204 via control path 214; (2) output serial test stimulus patterns to scan path circuit 204 via scan input path 218; (3) input serial test response patterns from scan path circuit 204 via scan output path 220;", "input_text": " FIG. 1. Scan architectures can be applied at various circuit levels. For example, the scan architecture of FIG. 
1 may represent the testing of a complete IC, or it may represent the testing of an embedded intellectual property core sub-circuit within an IC, such as a DSP or CPU core sub-circuit. The scan architecture includes a scan path circuit 104, logic circuitry to be tested 108, and connection paths 112-120 to a tester 110. Tester 110 operates to: (1) output control to operate scan path 104 via control path 114; (2) output serial test stimulus patterns to scan path 104 via scan input path 118; (3) input serial test response patterns from scan path 104 via scan output path 120; (4) output parallel test stimulus patterns to logic 108 via primary input path 112; and (5) input parallel test response patterns from logic 108 via primary output path 116. Scan path 104 operates, in addition to its scan input and scan output modes to tester 110, to output parallel test stimulus patterns to logic 108 via path 122, and input parallel response patterns from logic 108 via path 124.\nTypically tester 110 is interfaced to the scan architecture by probing the die pads at wafer level, or by contacting package pins after the die is assembled into a package. While tester 110 connections to the primary inputs 112 and primary outputs 116 of logic 108 are shown, the primary input and output connections could be achieved by augmentation of scan path 104. For example, scan path 104 could be lengthened to include boundary scan cells located on each primary input and primary output of logic 108. The boundary scan cells would provide primary inputs to and primary outputs from logic 108, via widened stimulus and response busses 122 and 124, respectively. 
In some instances, logic 108 may be sufficiently tested by scan path 104 such that it is not necessary to provide primary inputs to and outputs from"} +{"output_text": " hydroxy;\nR10 is selected from the group consisting of hydrogen, alkyl, haloalkyl, alkoxy, amino, alkylamino, dialkylamino, cycloamino, alkylcarbonylamino, guanidino, carboxy, alkoxycarbonyl, and tetrazole;\nR13 is selected from the group consisting of hydrogen, alkyl, haloalkyl, alkoxy, amino, alkylamino, dialkylamino, cycloamino, alkylcarbonylamino, guanidino, carboxy", "input_text": "-2xe2x80x94, xe2x80x94SO2xe2x80x94NR1xe2x80x94, xe2x80x94NR1xe2x80x94SO2xe2x80x94, xe2x80x94C(xe2x95x90O)xe2x80x94CHRXxe2x80x94, xe2x80x94CHRXxe2x80x94C(xe2x95x90O), and cycloalkylene;\neach RX is independently selected from the group consisting of hydrogen, hydroxy, alkyl, haloalkyl, aminoalkyl, guanidinoalkyl, alkoxy, amino, alkylamino, dialkylamino, cycloamino, alkylcarbonylamino, guanidino, carboxy, alkoxycarbonyl, and tetrazole;\neach RY is independently selected from the group consisting of hydrogen, alkyl, haloalkyl, carboxy, and alkoxycarbonyl;\neach R1 is independently selected from the group consisting of hydrogen and lower alkyl;\nR2 is selected from the group consisting of hydrogen, halo and hydroxy;\nR5 is selected from the group consisting of hydrogen, halo, alkyl, haloalkyl, alkoxy, amino, alkylcarbonylamino, alkylsulfonylamino, benzenesulfonylamino, toluenesulfonylamino, carboxy, aminocarbonyl, alkylaminocarbonyl, dialkylaminocarbonyl, cycloaminocarbonyl, and alkoxycarbonyl;\nR6 is selected from the group consisting of hydrogen, halo, and hydroxy;\nR8, R9, R11, R12, R14, R15, R17, and R18 are independently selected from the group consisting of hydrogen, halo, alkyl, haloalkyl, methoxy, and"} +{"output_text": " spirit and scope of the invention will be apparent to those of skill in the art from the detailed description.\nAccording to a first aspect of the present invention, there is provided an image 
coding method for making the most of the features of MPEG4 principally executing the coding in object units, i.e., separates a desired object (an image of a specific area) in a natural image from another region, and performing the coding for an image signal having no shape information, which is obtained by taking", "input_text": " structure of a coded signal (pixel value signal) VSstr having shape information and pixel information. In this coded signal VSstr, the shape-related data Smb and pixel-related data Vmb are alternately arranged subsequent to VOP headers VSvoph, for each macroblock.\nThe coding of the pixel-related data is omitted for a target macroblock which has been judged to be an outside-object macroblock on the basis of the shape-related data.\nHowever, it is not easy to extract shape information of an object (an image of a specific area) included in a natural image, from an image signal obtained by taking a picture of the natural image by a camera.\nPractically, as a method for extracting the shape information, there is almost only a method for taking an image in an equipped environment such as a studio, subjecting an image signal which is obtained by this image-taking to the chromakey processing, and extracting the shape of the specific area in the image. 
In this shape information extraction method, it is difficult to realize a coding apparatus which makes the most of the features of MPEG4 principally executing the coding in object units.\nIt is an object of the present invention to provide an image coding method for making the most of the features of MPEG4 principally executing the coding in object units, i.e., separates a desired object (an image of a specific area) in a natural image from another region, and performing the coding for an image signal having no shape information, which is obtained by taking a picture of the natural image, and an image coding apparatus utilizing this image coding method, and a data storage medium which contains a program for realizing the image coding method by software.\nOther objects and advantages of the present invention will become apparent from the detailed description and specific embodiments described are provided only for illustration since various additions and modifications within the"} +{"output_text": " liner sheet from the substrate.\nIn another exemplary embodiment of the invention, a carrier construction is provided which includes a base sheet or substrate covered over the entirety of one side thereof with a low tack adhesive. A paper liner sheet is mounted over the adhesive side of the substrate, with a transverse slit provided near one end to permit the liner to be peeled from the substrate when ready for use. A transparent film or cover sheet is secured at this same end of the construction so that the transparent cover sheet can", "input_text": " H NMR. This invention relates generally to a carrier form which is designed to permit a number of smaller documents to be processed simultaneously through micrographics equipment to produce a microfiche film of the documents.\nCarrier assemblies for displaying and/or storing smaller documents such as microfiche film, are well known. For example, see U.S. Pat. No. 4,156,978; Canadian Patent No. 
1,177,355; and French Patent No. 2,229,082.\nOther carrier constructions are known wherein documents, such as photographs, are adhesively secured to a substrate and covered by a transparent cover film or sheet. Examples of such carrier constructions may be found in U.S. Pat. Nos. 4,077,830 3,857,192; 3,736,685; 3,581,423 and 3,283,434.\nThe present invention provides a unique carrier construction wherein a number of smaller documents to be microfilmed may be adhesively mounted on a substrate and protected by a transparent cover sheet or film. In the carrier construction in accordance with this invention, a protective, removable liner sheet is interposed between the adhesive substrate and the transparent cover sheet. This removable liner sheet is preferably pre-printed and thereafter may be used as a record sheet upon its removal from the substrate as described further herein.\nAccordingly, in one exemplary embodiment of the invention, a carrier construction is provided which includes a base sheet or substrate covered over the entirety of one side thereof with a low tack adhesive. A paper liner sheet is mounted over the adhesive side of the substrate, with a transverse slit provided near one end to permit the liner to be peeled from the substrate when ready for use. A transparent film or cover sheet is secured at this same end of the construction so that the transparent cover sheet can be rolled back away from the liner to permit removal of the"} +{"output_text": " upper backwater chamber, and a plurality of hydraulic lines arranged to connect the upper backwater chamber to the lower backwater chamber. The apparatus also includes a plurality of inlet openings arranged to receive the first backwater from the upper backwater chamber and to direct the first backwater to the lower backwater chamber. 
The apparatus further includes a plurality of outlet openings arranged to receive the first backwater from the lower backwater chamber and to direct the first backwater to the upper backwater chamber.\nThe", "input_text": " In this connection, vertical flows (waterfalls) may form. They have the disadvantage that they entrain and incorporate parts of the surrounding air.\nIt is not always simple to manage with the space available for this purpose in a paper machine because, due to the large amount of water, it is necessary to keep the flow cross sections for such backwater flows wide and favorable to flowing. Therefore, enough space is seldom available for removing the backwater without vertical flows.\nThe present invention provides a device with which it is possible to bring together parts of the backwater from various geodetic heights in a space-saving manner and with as little entrainment of air as possible. Fluctuations in the amount of water should be problem-free as well.\nIn particular, the present invention provides a device similar to that discussed above, but in which the upper backwater chamber is connected to the lower backwater chamber via several hydraulic lines having inlet openings inside the upper backwater chamber. Further, these inlet openings are arranged at different geodetic heights relative to one another.\nWith the aid of the device according to the invention, the differences in height can be overcome with extremely little entrainment of air. The predominant part of the backwater can flow off downwardly through lines that are completely filled with water. The inlet openings lie at different heights such that, in general, just one single line will be only partially filled with backwater, whereas the inlet openings of the others are located either completely below or above the water level. 
The number of hydraulic lines with water flowing through them will differ, depending on the water level in the upper backwater chamber.\nThe present invention is directed to an apparatus for guiding portions of backwater produced or stored at different levels of a machine. The apparatus includes an upper backwater chamber arranged to receive first backwater, a lower backwater chamber arranged below the"} +{"output_text": " in the hash table.\nIn the example of FIG. 1, the hash table 4 is logically ordered according to social security number. As illustrated in FIG. 1, the hash table 4 is logically ordered according to social security number. As further illustrated in FIG. 1, the hash table 4 is logically ordered according to social security number. As further illustrated in FIG. 1, the hash table 4 is logically ordered according to social security number. As further illustrated in FIG. 1, the hash table", "input_text": " more particularly to a system for improving access to nearest logical records in logically ordered data contained in a non-logically ordered hash table.\n2. Background of the Invention\nTraditional linear hash tables optimize access time by evenly distributing records across the underlying table. In the process of entering records into the hash table, any logical ordering of the data is lost. While access time for a specific record given a specific key is fast, the ability to \u201cwalk\u201d to adjacent logically ordered (not physically ordered in the hash table) is lost. Further, in the absence of a record existing in the hash table for a given key, due to the logical ordering of the data being lost on entry into the hash table, it is impossible to perform a time optimal \u201cfind nearest\u201d, \u201cfind nearest preceding\u201d, or \u201cfind nearest proceeding\u201d type of query.\nReferring now to FIG. 1, there is illustrated an example of the use of hashing to store and retrieve logically ordered data. 
In this example, employee records 1 include an employee name 2, and a social security number 3 used as key k for a hash function F(k) used to map employee records 1 to a hash table 4 with an Index 5 space of 2000 (0-1999). Records 1 are logically ordered according to social security number. As illustrated in FIG. 1, records 1 are mapped, in this example, with hash function F(k) (which may be any arbitrary function), as follows:\nAs further illustrated in FIG. 1 and Table 1 above, each record in the hash table is threaded by the inclusion of pointers 6 to the next succeeding and next preceding logically ordered record"} +{"output_text": " the carrier is removed. The insulator is then removed from the conductive patterns to form a plurality of conductor patterns.\nU.S. Pat. No. 4,926,550 discloses a method of forming a multilayer ceramic substrate. A plurality of ceramic green sheets are stacked and laminated together. Conductive patterns are formed on a carrier. The conductive patterns are then completely blanketed by an insulator and the carrier is removed. The insulator is then removed from the conductive patterns to form a plurality", "input_text": " a connector screen for interconnecting adjacent surfaces of boards or modules is disclosed. The connector screen comprises of conducting connector elements that are separated by a web of nonconducting material.\nA connector assembly for a circuit board testing machine is disclosed in U.S. Pat. No. 4,707,657. An electrically insulating material having circuit tracks of an electrically conductive material is arranged on opposite side surfaces. The test points are electrically insulated from each other.\nA process to form Multilayered Ceramic (MLC) Substrates, having solid metal conductors, is taught in U.S. Pat. No. 4,753,694. The MLC substrate involves, forming a pattern of solid, nonporous conductors to a backing sheet having a release layer, then transferring the pattern to a ceramic green sheet.\nU.S. Pat. No. 
4,926,549, discloses a method of producing electrical connection members. A carrier is formed on a first electrically conductive member, holes are etched in portions of the carrier to expose the first electrically conductive member and to form recesses therein. The recesses have a diameter larger than the diameter of the corresponding hole. The respective holes formed in the carrier are filled with a second electrically conductive material, and subsequently, the first electrically conductive member is removed from the carrier, thereby, leaving a carrier having a plurality of an electrically conductive material protruding out of the upper and lower surfaces of the carrier. The carrier having the plurality of electrical conducting protrusion can then be used to connect a semiconductor device to a circuit board.\nIBM Technical Disclosure Bulletin, Vol. 27, No. 3, pp. 1404-1405 (August 1984) discloses a process for transferring thin-film conductor patterns to a multilayer ceramic substrate. Conductive patterns are formed on a carrier. The conductive patterns are then completely blanketed by an insulator and"} +{"output_text": " and has the disadvantage of being unable to sufficiently improve the capacity of the alkaline secondary battery.\nIn order to solve these problems, a method of using a non-woven fabric made of a polyolefin fiber and a polyamide fiber in combination has been proposed (for example, refer to Patent Document 1). However, since the polyamide fiber is hydrophobic, it is resistant to wetting by electrolyte and has a low electrolyte retention volume. 
Consequently, this non-woven fabric has high electrical", "input_text": " to 200 \u03bcm, and the capacity of alkaline secondary batteries was unable to be significantly improved.\nOn the other hand, a non-woven fabric using aliphatic polyamide fibers such as fibers made of Nylon 6 or Nylon 66 has come to be used as a non-woven fabric for alkaline battery separators that has superior hydrophilicity and liquid retention with respect to electrolyte and low electrical resistance when containing electrolyte. Alkaline secondary batteries using this aliphatic polyamide fiber non-woven fabric have superior alkaline resistance, high hydrophilicity and superior electrolyte retention, while also having the characteristic of superior discharge characteristics for large currents. However, this non-woven fabric lacks chemical stability, and has inferior heat resistance as represented with the glass transition temperature as well as inferior oxidation resistance at high temperatures in particular. Consequently, it has the disadvantage of being susceptible to oxidation and decomposition by oxygen gas generated during charging of the alkaline secondary battery, and causes a significant decrease in battery performance when the alkaline secondary battery is used under temperature conditions within the range of 60 to 80\u00b0 C. Thus, alkaline secondary batteries in which an aliphatic polyamide fiber non-woven fabric is used for the separator non-woven fabric demonstrate large self-discharge caused by decomposition of the non-woven fabric, and particularly in the case of alkaline secondary batteries that undergo repeated charging and discharging at high temperatures, the cycle life is shortened considerably.\nOn the other hand, polyolefin fiber non-woven fabric has been used in alkaline secondary batteries requiring heat resistance at comparative high temperatures. 
Although polyolefin fiber non-woven fabric has superior heat resistance, since it is hydrophobic, it is resistant to wetting by electrolyte and has a low electrolyte retention volume. Consequently, this non-woven fabric has high electrical resistance when used as the separator non-woven fabric of an alkaline secondary battery,"} +{"output_text": " side and the piezoelectric element circuit elements. The timing control circuit is configured to provide a timing input to the piezoelectric element circuit elements to control timing of breakdown and duration of energy delivery.\nIn another aspect, the invention is a method of controlling timing of breakdown and duration of energy delivery in an ignition system. The method includes providing a piezoelectric transformer having a drive side, an output side, and a piezoelectric element circuit elements in electronic communication with the output side that tune output impedance in series with a breakdown gap", "input_text": " respect to the particular combustion system, e.g., an IC engine system, etc.\nThe timing control may be a function of a predetermination or calibration of system performance. Exemplary input signals for the timing control also include tuned output circuit impedance, predetermined external system calibration, and parameters measured by external system sensors. With tuned electrical output impedance, the power flow across the spark gap is optimized such that, when combined with accurate timing inputs, controlled timing of breakdown and duration of energy delivery is made possible. Through precise timing inputs, TPDI output power is converted to regulated energy delivery.\nTypical ignition systems, for example, magnetically transformed and captive discharge, provide control over timing of breakdown voltage and limited quantities of post-breakdown energy. 
In contrast, TPDI provides a tool for controlling both absolute timing of breakdown and relative timing and quantity of the post-breakdown energy.\nTPDI can be also used as the sole ignition system for IC engines of all sizes. It can also be used as a starter system for IC engines, either as a simple parallel add-on or a means for reducing the size, weight and cost of the starter motor. TPDI has utility as an initiator for energetic materials commonly used in detonators and pyrotechnic actuators. The system may also be exploited in starters and ignition systems for general and commercial aviation industries. In addition, the system may be incorporated into pest killers, electrostatic discharge weapons (Taser), and safe and arm devices.\nIn one aspect, the invention is an ignition system. The ignition system includes a piezoelectric transformer having a drive side, an output side, and a piezoelectric element circuit elements in electronic communication with the output side that tune output impedance in series with a breakdown gap to optimize power flow from the transformer to the breakdown gap after breakdown, and a timing control circuit in electronic communication with the drive"} +{"output_text": " tinctorily effective amount of oxidative dye precursors; and\na hair conditioning effective amount of HPBISAPDC conditioning agent.\nAnother aspect of this invention comprises a two-part composition for oxidative dyeing of hair, the composition comprising:\na dye lotion formulation comprising:\nat least about 50% by weight water;\na tinctorily effective amount of oxidative dye precursors; and\na hair conditioning effective amount of HPBISAPDC conditioning agent; and\na developer composition", "input_text": " wet combing characteristics immediately after the dyeing mixture is washed from the hair.\nThe invention comprises a hair dyeing composition containing hydroxypropyl bisisostearamidopropyldimonium chloride (also referred to herein as HPBISAPDC) as the primary 
conditioning material in a two-part aqueous hair dyeing composition. HPBISAPDC is available as Schercoquat 21AP sold by Scher Chemicals, Inc. of Clifton, N.J., which contains 85% HPBISAPDC in a 15% propylene glycol diluent. The invention also comprises a two-part hair aqueous dyeing composition comprising a dye lotion formulation containing oxidizable dye precursors; at least about 50% by weight water, and a conditioning effective amount of HPBISAPDC conditioning agent.\nThis invention comprises a two-part system comprising aqueous, oxidative, hair coloring compositions (lotions and developers) for mixture with each other shortly before use. The lotion comprises an aqueous alkaline composition having a pH of from about 7 to 11 and a water content of at least about 50% by weight, a tinctorily effective amount of oxidative dye precursors, and a hair conditioning effective amount of HPBISAPDC conditioning agent. The second part, i.e., the developer, is an aqueous composition with a pH of from about 2 to about 6, preferably 2 to 3, containing a peroxide oxidizing agent.\nThe invention also comprises a kit or package of the developer and dye lotion formulations. A further aspect of this invention is the use of such two-part systems for the oxidative coloration of hair.\nOne aspect of this invention comprises a high aqueous-content dye lotion formulation for use in a two-part composition for oxidative dyeing of hair, the dye lotion formulation comprising:\nat least about 50% by weight water;\na"} +{"output_text": " number of anodes increases.\nThe paper spacer is a critical element in the design of an aluminum electrolytic capacitor. The paper spacer must be thin enough to allow the anode to be inserted into the paper spacer without damaging the paper spacer. The paper spacer must also be strong enough to withstand the forces exerted on the capacitor during the manufacturing process. 
The paper spacer must also be flexible enough to allow the capacitor to be wound into a cylindrical shape. The paper spacer must also be able to withstand the high", "input_text": " for the anode plates, other metals such as tantalum, magnesium, titanium, niobium, zirconium and zinc may be used. A typical solvent-based liquid electrolyte may be a mixture of a weak acid and a salt of a weak acid, preferably a salt of the weak acid employed, in a polyhydroxy alcohol solvent. The electrolytic or ion-producing component of the electrolyte is the salt that is dissolved in the solvent. The entire laminate is rolled up into the form of a substantially cylindrical body, or wound roll, that is held together with adhesive tape and is encased, with the aid of suitable insulation, in an aluminum tube or canister. Connections to the anode and the cathode are made via tabs. Alternative flat constructions for aluminum electrolytic capacitors are also known, comprising a planar, layered, stack structure of electrode materials with separators interposed therebetween, such as those disclosed in the above-mentioned U.S. Pat. No. 5,131,388.\nIn ICDs, as in other applications where space is a critical design element, it is desirable to use capacitors with the greatest possible capacitance per unit volume. Since the capacitance of an aluminum electrolytic capacitor is provided by the anodes, a clear strategy for increasing the energy density in the capacitor is to minimize the volume taken up by paper and cathode and maximize the number of anodes. A multiple anode stack configuration requires fewer cathodes and paper spacers than a single anode configuration and thus reduces the size of the device. A multiple anode stack consists of a number of units consisting of a cathode, a paper spacer, two or more anodes, a paper spacer and a cathode, with neighboring units sharing the cathode between them. 
Energy storage density can be increased by using a multiple anode stack configuration element; however, the drawback is that the equivalent series resistance, ESR, of the capacitor increases as the"} +{"output_text": " at which the magnetization vector of the material is stable). The magnetic coil is then energized to create a magnetic field that orients the magnetization vector of the material in a direction perpendicular to the plane of the disk. The optical spot is then moved to a new location on the disk, and the magnetic coil is de-energized. The optical spot is then focused on the disk to heat the magneto-optical material to a temperature below the Curie point. The magnetic coil is then energ", "input_text": ", a mobile PUF is limited by a maximum mobile transmission power. The PUF scheme has limitations in its effectiveness of estimating a mobile location if a terminal is positioned where the distance between the terminal and base stations is large or the terminal runs out of battery life. 1. Field of the Invention\nThe present invention relates generally to data storage systems having optical data tracking, storage or retrieval systems. More particularly, the present invention relates to data storage and/or retrieval systems include steerable optics.\n2. Background Art\nIn data recording and retrieval systems that use a moving media having a varying material characteristic, detectable variations from previously encoded media locations may be retrieved using reflected incident light. Such variations may also be used in providing servo control signals for following previously recorded data tracks. For example, in a magneto-optical storage system, using a Magneto-Optical (MO) recording material deposited on a rotating disk, information may be recorded on the disk as spatial variations of magnetic domains. 
During readout, the magnetic domain pattern modulates an optical polarization, and a detection system converts a resulting signal from optical to electronic format.\nIn one type of magneto-optical storage system, a magneto-optical head assembly is located on a linear actuator that moves the head along a radial direction of the disk to position the optical head assembly over data tracks during recording and readout. A magnetic coil is placed on a separate assembly on the head assembly to create a magnetic field that has a magnetic component in a direction perpendicular to the disk surface. A vertical magnetization vector of polarity (opposite to that of the surrounding magnetic material of the disk medium) is recorded as a mark indicating zero or a one by first focusing a beam of laser light to form an optical spot on the disk. The optical spot functions to heat the magneto-optical material to a temperature near or above a Curie point (i.e., a temperature"} +{"output_text": "l 46, 1-9). The authors suggested that the decrease in progestagens was due to the activation of the HPA axis by the foal's own cortisol. However, the authors did not report any evidence of HPA axis activation in the foals.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a high-speed transistor.\nIn recent years, a semiconductor device having a high-speed", "input_text": " been limited to questions of broad interest like national elections and the outcome of professional sporting events. This limitation reflects a short falling of conventional prediction markets, that their predictive capability is a direct function of their liquidity, or ultimately their number of active users. 
Neonatal maladjustment syndrome (NMS) is a common disorder of neonatal foals that manifests within the first 72 h of life (Bernard, et al (1995) In: Proceedings, 41st American Association of Equine Practitioners, Lexington, Ky. pp 222-224; Rossdale and Leadon, (1975) J Reprod Fertil 23, 658-661). The proposed mechanisms include hypoxic and ischaemic events prior to, during and shortly after parturition (Palmer and Rossdale, (1976) Res Vet Sci 20, 267-275). Affected foals exhibit neurological dysfunction such as seizures and altered states of consciousness, behaviour and response to stimuli (Bernard et al. 1995, supra; Ringger, et al. (2011) J Vet Intern Med 25, 132-137). However, hypoxic and ischaemic injury is not always identified upon histopathological evaluation, and long-term neurological deficits have been reportedly rare. Fetal corticosteroids, through activation of the hypothalamo-pituitary-adrenocortical (HPA) axis, contribute to the maturation of many organs and regulate the transition between intra- and extrauterine life (Rossdale, (2004) In: Proceedings, 51st American Association of Equine Practitioners, Denver, Colo. pp 75-126). Rossdale, et al., ((1995) Reprod Fertil Dev 7, 567-575) reported increased concentrations of progestagens in neonatal foals that rapidly decrease over the following 48 h after birth (Houghton, et al., (1991) J Reprod Fert Supp"} +{"output_text": ".\nThe pitch control rod assemblies are typically mounted to the main rotor shaft by means of a pitch control rod bearing assembly. The pitch control rod bearing assembly is typically a roller bearing assembly, which is a type of ball bearing assembly. The pitch control rod bearing assembly is typically mounted to the main rotor shaft by means of a pitch control rod bearing cap. The pitch control rod bearing cap is typically a metal cap that is welded to the main rotor shaft. The pitch control rod bearing cap is typically", "input_text": " at which an airfoil (i.e. 
a main rotor blade) passes through the air. This angle may be called the \"angle of attack\"; it is measured between the chord of the airfoil and the direction of the relative wind. By changing the collective pitch, an aviator adjusts the vertical lift and resultant altitude of the aircraft. Minimal collective pitch is used when hovering flight is desired at a given resultant altitude. Changes in collective pitch may be accompanied by automatic or manual adjustments in engine power and cyclic pitch so as to maintain normal engine rpm, to control rates of vertical climb and horizontal flight, to correct dissymmetry of lift in forward flight, and to avoid stalling.\nWhile the precise details of individual systems may vary in their particular components and design characteristics, main rotor blade assemblies typically include pitch control rod assemblies. The control rod assemblies constitute means for translating control impulses from the aviator's movement of the collective and cyclic pitch control levers in the cock pit of the aircraft. Servomechanical power is typically used to provide the mechanical force required for rotating the control rods and adjusting the pitch or angle of attach of each main rotor blade.\nThe main rotor blades must rotate at high speeds, with tip speeds often in excess of 400 miles per hour, to generate sufficient lift to elevate the aircraft off the ground and to the desired altitude. The blade assemblies, including the pitch control rod assemblies, are subjected to large loads even in routine operations. The loads include high frequency harmonic motion from vibrational resonance, lead-lag oscillations and flapping, turbulence from airflow patterns and Coriolis forces, resultant lift and induced drag.\nBecause of the loads on the main rotor shaft, blades, and components of their constituent assemblies, maintenance is often a challenge and durability and reliability are abiding concerns. 
This applies to the pitch control rod assemblies as well as to other components of the main rotor system"} +{"output_text": " sneeze guards, the rigid support frame or support posts are permanently affixed to the service counter or cart, and the pane of glass or plastic material is permanently affixed to the rigid support frame or support posts. In other sneeze guards, the rigid support frame or support posts are permanently affixed to the service counter or cart, and the pane of glass or plastic material is mounted on the rigid support frame or support posts by a mounting means, such as a threaded rod, which is", "input_text": " of the conducting polymer complexes that are advantageous for stretch, melt processing, or for compounding. PA1 (1) Aqueous and non-aqueous solutions of the conductive polymer complexes that are useful for spraying or casting thin films of conductive polymer. PA1 (2) Colloidal suspension of the conductive polymer complexes that are advantageous for coating, painting, printing or for compounding. PA1 (3) Solid state of the conducting polymer complexes that are advantageous for stretch, melt processing, or for compounding. Sneeze guards have been used for many years to protect unpackaged prepared food and beverages, when they are displayed in a service line for customer viewing and selection, from certain contaminants. Indeed, state and local laws and regulations require all such food to be shielded from droplet contamination which may be expelled during a cough or sneeze from the nose or mouth of a potential customer.\nAccordingly, sneeze guards are well known and widely used in the food service industry. Sneeze guards are customarily used in retail food service such as cafeterias, smorgasbords, salad bars and buffet lines, which provide a service line displaying food for a customer's selection. Sneeze guards must protect the displayed food in the zone of potential droplet contamination. 
The zone of potential droplet contamination is determined based upon the height and placement of the service line, and the average height, range of the potential customers.\nAlthough sneeze guards are available in several styles and configurations, typically a sneeze guard has either a rigid support frame, or two or more rigid and stationary support posts, and a mounted pane of glass or plastic material which provides the shield or barrier between the displayed food and the customers. Generally, the rigid support frame or support posts of the sneeze guard are permanently affixed to a stationary surface, such as a service counter or cart. In some"} +{"output_text": ", and a first interconnect 12 is formed on the first interlevel dielectric film 11. In FIG. 1B, a second interlevel dielectric film 13 is formed on the first interconnect 12, and a second interconnect 14 is formed on the second interlevel dielectric film 13. In FIG. 1C, a third interlevel dielectric film 15 is formed on the second interconnect 14, and a third interconnect 16 is formed on the third interlevel dielectric film 15. The air gap is formed between the first interconnect", "input_text": ". The present invention also relates to a method for manufacturing such a multilevel interconnection structure.\n(b) Description of the Related Art\nWith the advance of finer pattern and higher operational speed of transistor elements in a semiconductor device, the line width and the line space in the interconnect pattern have been reduced remarkably. The reduction of the thickness of the semiconductor device, however, is not noticed partly because such a reduction is limited in view that a smaller interconnect in the thickness has a larger line resistance. As a result, parasitic capacitance between interconnects, especially in the same layer, tends to increase. 
For example, a current semiconductor device having a 0.35 \u03bcm design rule MOSFET has a line space between layers on the order of 1 \u03bcm, and has a line space between lines in the same layer on the order of 0.5 \u03bcm, which means that the parasitic capacitance between interconnects in the same layer is dominant compared to that between layers in the current semiconductor device. In a next generation semiconductor device, wherein a finer space will be achieved between interconnects in the same layer with the line thickness being maintained, it is likely that the most of the component of the parasitic capacitance is attributable from the adjacent interconnects in the same layer. In this case, the semiconductor device will not effectively function due to its lower operational speed.\nPatent Publication JP-A-7-326670 proposes a multilevel interconnection structure wherein an air void (air gap) is provided between adjacent lines in order to decrease the parasitic capacitance therebetween for improvement of the operational speed of the semiconductor device. Air has a lowest permittivity among known materials to thereby obtain a lower parasitic capacitance. FIGS. 1A to 1C show the process for fabrication of the multilevel interconnection structure having the air gap. In FIG. 1A, a first interlevel dielectric film 11 is formed on a semiconductor substrate 10"} +{"output_text": " a higher data rates. But, conventional cable modem and DSL infrastructure attached to the general Internet have far less tolerance for peak bandwidth requirements for compressed video. 
So, online services that host video games or applications in server centers a long distance from the client devices, and then stream the compressed video output over the Internet through conventional residential broadband connections suffer from significant latency and peak bandwidth limitations\u2014particularly with respect to games and applications which require very low latency (e.g., first person shooters and other multi-", "input_text": ", where the virtual camera is constantly moving around jerkily). Such video games can result in frame sequences with large and frequent peaks where the user may need to clearly see what is happening during those sudden motions. As such, compression artifacts are far less tolerable in 3D high action video games. Thus, the video output of many video games, by their nature, produces a compressed video stream with very high and frequent peaks.\nGiven that users of fast-action video games have little tolerance for high latency, and given all of the above causes of latency, to date there have been limitations to server-hosted video games that stream video on the Internet. Further, users of applications that require a high degree of interactivity suffer from similar limitations if the applications are hosted on the general Internet and stream video. Such services require a network configuration in which the hosting servers are set up directly in a head end (in the case of cable broadband) or the central office (in the case of Digital Subscriber Lines (DSL)), or within a LAN (or on a specially-graded private connection) in a commercial setting, so that the route and distance from the client device to the server is controlled to minimize latency and peaks can be accommodated without incurring latency. 
LANs (typically rated at 100 Mbps-1 Gbps) and leased lines with adequate bandwidth typically can support peak bandwidth requirements (e.g., 18 Mbps peak bandwidth is a small fraction of a 100 Mbps LAN capacity).\nPeak bandwidth requirements can also be accommodated by residential broadband infrastructure if special accommodations are made. For example, on a cable TV system, digital video traffic can be given dedicated bandwidth which can handle peaks, such as large I frames. And, on a DSL system, a higher speed DSL modem can be provisioned, allowing for high peaks, or a specially-graded connection can provisioned which can handle"} +{"output_text": " drilling opportunities could be opened up that are currently not economically feasible.\nThe industry does not have a system that can drill a 20 mile lateral wellbore in a single trip. The industry does not have a system that can drill a 20 mile lateral wellbore in a single trip that can be used in a single trip. The industry does not have a system that can drill a 20 mile lateral wellbore in a single trip that can be used in a single trip that can be used in", "input_text": " and its alloys, by forcing high temperature helium or other noble gases into the titanium during fabrication are also described.\nNumerous different embodiments of hydraulic seals are described for the Smart Shuttle, for the Subterranean Electric Drilling Machine, and for pipeline pigs, including novel cup seals and novel chevron seals.\nDifferent embodiments of hydraulic seals are described which incorporate measurement sensors, and in yet other embodiments, measurement information from the sensors is used for the closed-loop feedback control of the hydraulic seals.\n2. Description of the Related Art\nThe oil and gas industry does not now have the capability to drill horizontally extreme distances of approximately 20 miles to commercially meet some of the challenges that exist today. 
Industry extended reach-drilling capability is currently between 6 and 7 miles. Conventional drilling rigs using drill pipe and mud motors at shallow angles have established these conventional records. These wells have pushed conventional drilling technologies close to their practical limit and new methods are required for longer offsets.\nThe industry's lack of a 20 mile drilling capability reduces accessibility to oil and gas reserves. Many areas, both onshore and offshore, have no surface access for development drilling. Onshore, this may be due to urban development as is the case in Holland, national parks or other special areas such as the Arctic National Wildlife Refuge (ANWR), or other land uses that are sensitive to surface drilling operations. Offshore, the incentive is to maximize the use of existing structures and infrastructure by replacing expensive flowlines, manifold and trees. Near shore regions as found in the Santa Barbara Channel, and especially where ice may be present such as in the Arctic or near Sakhalin Island, or where migrating whales may limit seasonal operations provide significant incentives for this new 20 mile drilling capability.\nThe industry does not have an extreme reach lateral drilling system that is compatible with existing drilling and production infrastructure. If such a system were available, new"} +{"output_text": " tested, and debugged before it is ever used in a real system.\nThe present invention relates to a method for manufacturing a semiconductor device, and more particularly to a method for manufacturing a semiconductor device having a trench isolation structure.\nIn recent years, as the integration density of semiconductor devices increases, the isolation structure of the semiconductor device is becoming more and more important. 
In particular, as the integration density of semiconductor devices increases, the isolation structure of the semiconductor device is becoming more and more important.", "input_text": " to such foundries is the functional logic description required for the chip, and a set of photolithographic masks which are then used in the manufacture of the desired electrical circuit chip devices. The output of the foundry is a set of chips that are a physical manifestation of the logic description and masks provided to the foundry. However, it is noted that the construction of such masks and the initial production of circuit chips is expensive. Any passage of a given device having the prescribed logic functionality though such a foundry is an expensive and time consuming process which clearly should be undertaken only once. It is the purpose of emulation/acceleration engines to ensure such a single passage from the functional logic design stage through the stage of chip production via such a foundry.\nVerifying that logic designs are correct before committing a design to manufacturing, therefore, eliminates the need for costly and time-consuming multiple passes through a silicon foundry. Debugging logic errors deep inside a logic chip can be extremely difficult because of very limited observability. Emulation provides two very significant advantages. Firstly, the proper verification of a functional logic design eliminates the need for a second costly passage through the foundry, and, secondly, and just as importantly, getting the design \u201cright the first time\u201d means that the design does not have to be debugged using foundry produced parts having design errors. 
Accordingly, production delays are significantly reduced and the time to market for the particular technology/technology improvements embedded in the integrated circuit chip is greatly reduced, thus positively impacting the ability to deliver the most sophisticated technological solutions to consumers in as short of time as possible.\nAn additional advantage that emulation/acceleration systems have is that they act as a functioning system of electrical circuits which makes possible the early validation of software which is meant to operate the system that the emulator/accelerator is mimicking. Thus, software can be designed,"} +{"output_text": " the transducer effect.\nThe inventors have found that the transducer effect is not only important for the calibration of the earphone, but also for the calibration of the device. The inventors have found that the transducer effect is not only important for the calibration of the earphone, but also for the calibration of the device.\nThe inventors have found that the transducer effect is not only important for the calibration of the earphone, but also for the calibration of the device.\nThe inventors have found that the", "input_text": "Conventional \u201chearing tests\u201d require a calibrated device to measure the hearing threshold level of an individual in a quiet environment. The data set captured from such hearing tests thus represents the threshold values of an individual's hearing.\nIn order for a device to produce a specific sound wave amplitude as a test tone, the device and its transducer combination require calibration. The reason for this is that the circuitry and transducer or earphone of each device have different frequency responses that influence the output amplitude of sound waves. This means the same electric audio signal will result in different amplitudes of sound waves for devices and earphones of different models. 
Therefore, by calibrating the device and the earphone combination, specific sound wave amplitudes as test tones can be produced for hearing testing purposes.\nBy using a calibrated device, consistent sound wave amplitudes can be produced and the accurate value of hearing levels can then be obtained by finding the thresholds over a range of frequencies in a sound-proof or quiet environment, and producing a \u201chearing profile\u201d which will represent a subject's hearing threshold or audiogram. The inventors' work on this aspect has been published in The International Journal of Audiology Vol. 51, No. 8, p 606-610 (August 2012). The data thus captured within the device may then be used as parameters based upon which the signal processing engine can modify or enhance the audio signal.\nDuring development, experiments were carried out and results from individuals with \u201cnormal\u201d hearing and individuals with hearing loss over a range of frequencies revealed there are factors which are important and should be taken into account during audio signal enhancement. Among these factors are transducers in the earphones and environmental noise. Further investigation using various earphones which differ from the calibrated (standard) earphone confirmed the importance of the \u201ctransducer effect.\u201d Indeed, any electronic component, whether substituted for another component or added to/removed from a calibrated signal pathway may contribute to"} +{"output_text": " MEMS device is a device in which a mechanical structure is assembled in a semiconductor IC.\nThe MEMS device is a device in which a mechanical structure is assembled in a semiconductor IC. The MEMS device is a device in which a mechanical structure is assembled in a semiconductor IC. The MEMS device is a device in which a mechanical structure is assembled in a semiconductor IC. The MEMS device is a device in which a mechanical structure is assembled in a semiconductor IC. 
The MEMS device is", "input_text": " a predetermined number of (three) synapses shown in FIG. 1.\nThe parallel system shown in FIG. 1 is advantageous in that the processing speed is high, as described above. However, the number of lines and the number of synapses must be determined at the stage of designing the circuit, and it is very difficult to increase or reduce the number thereafter.\nIn a conventional fuzzy neuron device, it is therefore not easy to increase the number of synapses or the like by connecting a plurality of fuzzy neuron devices to each other. Although the extension is not impossible if another device (circuit) is inserted between the fuzzy neuron devices, it leads to complication of the system and a rise in cost. 1. Technical Field\nThe present invention relates generally to an apparatus for measuring the position of the mirror of a diffractive light modulator and performing positional compensation and a method of controlling the apparatus, and, more particularly, to an apparatus for measuring the position of the mirror of a diffractive light modulator and performing positional compensation, which measures the position of the mirror of a diffractive light modulator by measuring the capacitance of the mirror, the capacitance of a piezoelectric material layer or the intensity of output diffracted light and performs positional compensation, and a method of controlling the apparatus.\n2. Description of the Related Art\nWith the development of microtechnology, Micro-Electro-Mechanical Systems (MEMS) devices and small-sized equipment, into which MEMS devices are assembled, are attracting attention.\nA MEMS device is formed on a substrate, such as a silicon substrate or a glass substrate, in microstructure form, and is a device into which an actuator for outputting mechanical actuating force and a semiconductor Integrated Circuit (IC) for controlling the actuator are electrically or mechanically combined. 
The fundamental feature of such a MEMS device is that an actuator having a mechanical structure is assembled in part of a device. The"} +{"output_text": " the passage of radio frequencies through walls is not limited to frequencies in the radio frequency range. For example, the passage of radio frequencies through walls is not limited to frequencies in the radio frequency range. For example, the passage of radio frequencies through walls is not limited to frequencies in the radio frequency range. For example, the passage of radio frequencies through walls is not limited to frequencies in the radio frequency range. For example, the passage of radio frequencies through walls is not limited to frequencies in the radio frequency", "input_text": " optical layers (i.e., the PET and metallic layers) as being between 1-3 microns, and a hardcoat having a thickness of 3.0 microns will result in film composite having an emissivity of greater than 0.35. Additionally, in order to achieve this composite emissivity of the film, the visible light transmittance (VLT) was limited to about 50%.\nIn addition to managing IR radiation, there exists a need to control electromagnetic radiation. Electromagnetic radiation of various frequencies is radiated from many devices used in a wide range of facilities including homes, workplaces such as offices, manufacturing and military installations, ships, aircraft and other structures. Examples of such devices include computers, computer monitors, computer keyboards, radio equipment, communication devices, etc. If this radiation escapes from the facility, it can be intercepted and analyzed for the purpose of deciphering data associated with or encoded in the escaped radiation. 
For example, technology exists for reconstructing the image appearing on a computer monitor in a building from a remote location outside the building or from a location within a building by detecting certain wavelength frequencies from the monitor screen even if the monitor screen is not in view from the remote location. This is accomplished by known techniques wherein certain frequencies of light from the monitor screen, even after being reflected from various surfaces inside the building or room where the monitor is located, escape and are intercepted and analyzed by an eavesdropper in another location outside the building or room where the monitor is located. Obviously, the ability of an eavesdropper to intercept such radiation constitutes a significant security risk, which is desirably eliminated from facilities where secrecy is essential.\nAlthough walls, such as brick, masonry block or stone walls may effectively prevent the escape of light frequencies from a facility, radio frequencies pass through walls that are not properly shielded to prevent such passage. Moreover,"} +{"output_text": " the camera.\nIn the context of automated image analysis, lighting conditions are often referred to as the illumination of the subject matter. Illumination refers to the amount of light that is incident on the subject matter, and is typically measured in terms of the amount of light that is incident on the subject matter per unit area. Illumination is a function of the angle of the light source relative to the subject matter, the distance between the subject matter and the light source, and the color of the", "input_text": " as an aircraft, including the position and state of flight control surfaces, in an environment with highly dynamic lighting conditions.\n2. Description of the Related Art\nThe recording and automated analysis of image data is well known in the prior art. 
For example, optical character recognition, or OCR, is the process of analyzing an image of a document and converting the printed text found therein into machine-editable text. OCR programs are readily available and often distributed for free with computer scanners and word editing programs. OCR is a relatively simple task for modern software systems, as documents are typically presented with known lighting conditions (that is, an image of dark text on a light background, captured with the consistent, bright exposure light of a document scanning system) using predetermined character sets (that is, known and readily-available character fonts).\nSystems attempting to recognize handwritten text have the added challenge of handling the variations in personal handwriting styles from one person to the next. Still, these systems often require that the writers print the text instead of using cursive and that they follow certain guidelines when creating their printed characters. Even in these systems, where the individual style variations must be accounted for, the lighting conditions used to capture the text images are well-controlled and consistent.\nAnother example of automated image analysis is facial recognition. A facial recognition system is a computer application for automatically identifying a person from a digital image of the person's face. Facial recognition programs are useful in security scenarios, such as analyzing passengers boarding an aircraft in an attempt to identify known terrorists. A typical facial recognition program works by comparing selected facial features from the image, such as the distance between the person's eyes or the length of the nose, against a facial feature database. As with optical character recognition, facial recognition works best in controlled lighting conditions when the subject matter (that is, the face) is in a known orientation relative to"} +{"output_text": " pulsed.\nThe transceiver is the heart of the system. 
It is the device that receives the radio signals from the antenna and decodes the information contained in the signals. The transceiver is also the device that transmits the radio signals to the tag.\nThe tag is the device that is programmed with the unique information to be read. The tag is the conduit between the transceiver and the transponder. The tag contains the information to be read and the means to store it. The tag", "input_text": " analysis systems worldwide.\nSuch systems are designed to serve mass markets with many millions of labels needed per year. For example, Philips Semiconductors\"\" ICODE ICs represent the state-of-the-art in smart label technology, offering a low-cost, re-programmable and disposable solution for source tagging, automatic data capture, theft protection and data storage on a product or its packaging. ICODE smart labels allow almost any item to be tagged for efficient handling. ICODE\"\"s highly automated item scanning process does not require line-of-sight and can scan multiple labels at the same time.\nICODE smart labels offers considerable benefits in a broad variety of applications. In airline baggage tagging and parcel services, smart labels offer considerable advantages in sorting and item tracking. In supply chain management systems, smart labels overcome the limitations of barcode technology, providing improved product distribution; and in libraries and rental applications, they provide automated check-in, check-out and inventory control.\nAs shown in FIG. 1, a conventional RRIF system 10 consists of a tag reader 11 which is connected to a personal computer 12 (PC) through a serial port 13. The PC 12 takes action as it reads the trigger of a tag 14. 
Information can be exchanged via a communication medium 15 (e.g., Internet or Intranet) with a remote server 16.\nThe tag reader 11 typically consists of three components:\nAn antenna or coil;\nA transceiver (with decoder); and\nA transponder (commonly called an RF tag) that is electronically programmed with unique information.\nThe antenna emits radio signals to activate the tag and read and write data to it. The Antenna is the conduit between the tag and the transceiver. It helps control the system\"\"s data acquisition and communication. The electromagnetic fields produced by the antenna can be constantly present or"} +{"output_text": " crank is preferably mounted on a support that is fixed to the column. The support is preferably constituted by a plate that is fixed to the column and that is provided with a slot for receiving the crank. The slot is preferably arranged in a plane that is perpendicular to the axis of the column.\nThe turning control button is preferably constituted by a crank that is mounted on a support that is fixed to the column. The support is preferably constituted by a plate that is fixed to the column and that is provided", "input_text": " perform a new measurement. This measure mode is very useful for measuring the diameter of a hole or of a rod, for example.\nThese systems have the inconvenience of requiring an additional command wheel which increases the cost of the system and whose use is not very intuitive. 
Furthermore, it is necessary to let go for a while of the turning control button 8 to actuate the control wheel.\nIt is thus an aim of the present invention to propose a method for entering commands to switch the measure mode in a column for measuring vertical dimensions that avoids the inconveniences of the prior art methods, as well as a measuring column that is improved and easier to use than the measuring columns of the prior art.\nAccording to the invention, these aims are achieved by means of a method and of a measuring column having the characteristics of the corresponding independent claims, variants of preferred embodiments being moreover described in the dependent claims.\nIn particular, these aims are achieved by means of a method enabling a command to switch the measure mode to be entered in a dimension-measuring column, wherein this mode-switch command is entered only by acting on the angular position of a turning control button.\nThis method has the advantage that the mode switch is effected by moving the turning control button towards a predetermined angular position, different from the angular position range used for vertically displacing the probe tip. The mode-switch commands can thus be entered without it being necessary to let go of the turning button.\nThe The dimension-measuring column of the invention can function according to a limited and discrete number of different measure modes that can be selected by means of one of the turning control buttons. Each measure mode can furthermore call for continuous or quasi-continuous parameters that depend for example on the angular position of the turning control button between two predetermined thresholds.\nThe turning control button used is preferably constituted by the crank allowing the probe tip to be displaced vertically. 
The"} +{"output_text": "yraldehyde, n-butyral, n-butyric acid, n-butyric alcohol, n-butyric acid methyl ester, n-butyric acid ethyl ester, n-butyric acid isopropyl ester, n-butyric acid isobutyl ester, n-butyric acid isopropyl ester, n-butyric acid isobutyl ester, n-butyric acid isopropyl ester, n-buty", "input_text": "onic acid (see K. Achiva, et. al., Tet. Lett., 1475 (1978). Chirality of these ligands is also obtainable by the use of two different hydrocarbyl groups attached to the tetrahedral phosphorus atoms, such as bis(phenyl-n-butylphosphino) diphosphine chelate ligands, which could also be resolved into optical isomers through means known in the art.\nThe ligands of this invention also would find utility in the nickel catalyzed cross coupling reactions of Grignard reagents with aryl and vinyl halides, a reaction which has been carried out with similar chelating diphosphine ligands as disclosed by K. Yamamoto, et. al., Tet. Lett. 3 (1974) and M. Kumada, et. al., J. Amer. Chem. Soc., 98, 3718 (1976).\nThe present hydroformylation process in its broad sense comprises contacting at least one olefin having from 2 to about 20 carbon atoms in a reaction zone at a temperature of from about 20.degree. C. to about 250.degree. C. and a pressure of from about 15 psig to about 800 psig with syn gas (H.sub.2, CO) and a catalyst comprising rhodium in chemical complex with one or more of the above chelating diphosphino ligands for a sufficient period of time to permit reaction of said olefin with said syn gas to form aldehyde product.\nThe present ligands, in particular, those of Examples 2, 3, 4, 5, 7, 8, 9 and 10 of TABLE I below have special utility as a bidentate ligand modifier for the low pressure rhodium hydroformylation of alpha-olefins to prepare aldehyde products with unusually high ratios of normal to branched isomers in high yield. 
Such products from propylene include n-but"} +{"output_text": " Nature 385:721-725.\nThe Bcl-2 family of proteins includes both pro-apoptotic and anti-apoptotic proteins. The pro-apoptotic proteins include Bax, Bak, Bad, Bid, Bik, Bim, Nbk, and Bmf. The anti-apoptotic proteins include Bcl-2, Bcl-xL, Bcl-w, Mcl-1, A1, and A2. The anti-apoptotic proteins are thought to function by binding to and neutralizing", "input_text": " and reperfusion.\nIn Alzheimer's disease, Parkinson's disease, Huntington's chorea, epilepsy, amyotrophic lateral sclerosis, stroke, ischemic heart disease, spinal cord injury and many viral infections, for example, abnormally high levels of cell death occur. In at least some of these diseases, there is evidence that the excessive cell death occurs through mechanisms consistent with apoptosis. Among these are 1) spinal cord injury, where the severing of axons deprives neurons of neurotrophic factors necessary to sustain cellular viability; 2) stroke, where after an initial phase of necrotic cell death due to ischemia, the rupture of dead cells releases excitatory neurotransmitters such as glutamate and oxygen free radicals that stimulate apoptosis in neighboring healthy neurons; and 3) Human Immunodeficiency Virus (HIV) infection, which induces apoptosis of T-lymphocytes.\nIn contrast, the level of apoptosis is decreased in cancer cells, which allows the cancer cells to survive longer than their normal cell counterparts. As a result of the increased number of surviving cancer cells, the mass of a tumor can increase even if the doubling time of the cancer cells does not increase. Furthermore, the high level of expression in a cancer cell of the bcl-2 gene, which is involved in regulating apoptosis and, in some cases, necrotic cell death, renders the cancer cell relatively resistant to chemotherapeutic agents and to radiation therapy.\nIn recent years, a family of proteins has been discovered that controls apoptosis. 
The prototype of this family is Bcl-2, a protein that inhibits most types of apoptotic cell death and is thought to function by regulating an antioxidant pathway at sites of free radical generation. Hockenbery et al. (1993) Cell 75:241-251. More recent data suggests that Bcl-2 can also function as a channel protein and as an adaptor/docking protein. Reed, et al. (1997)"} +{"output_text": " conductive member providing electromagnetic shielding coming into contact with a conductive cover member of a composite electronic component.\nIn order to solve the above-mentioned problems, a surface acoustic wave device having a structure shown in FIG. 20 has been proposed. In FIG. 20, a surface acoustic wave device 1 includes a piezoelectric substrate 2, a pair of comb-like electrodes 3 and 4 disposed on the piezoelectric substrate 2, and a pair of reflectors 5 and 6 disposed on the piezoelectric substrate 2. The comb-", "input_text": " effect the characteristics of the surface acoustic wave device and the composite electronic component using the surface acoustic wave device.\nReduction in the height of an electronic component to be mounted is necessary with the miniaturization and reduction in thickness of a mobile or cellular telephone. Therefore, due to a physical impact and stress applied to a mobile or cellular telephone and conductive floating substances, such as solder scraps and dust, the conductive member providing electromagnetic shielding may come into contact with the conductive cover member of the composite electronic component in the package of the surface acoustic wave device. Accordingly, when these members come into contact, the parasitic inductance component is changed, resulting in deterioration in characteristics of the surface acoustic wave device.\nThat is, in a composite electronic component having a surface acoustic wave device built therein with a ladder-type circuit structure shown in FIG. 
19, for example, there are inductances L1 to L3 generated by bonding wires in the surface acoustic wave device, an inductance L4 generated by wiring or a through-hole electrode in the package of the surface acoustic wave device, and an inductance L5 generated by wiring disposed in a case of the composite electronic component. In addition, in FIG. 19, numeral P1 denotes a ground terminal of the surface acoustic wave device package. When a conductive member providing electromagnetic shielding comes in contact with a conductive cover member of the composite electronic component so as to be connected, a connecting route shown in a dotted line X is formed. Therefore, the inductance between the parallel arm resonator in the surface acoustic wave device and the ground potential is changed, resulting in deteriorating in characteristics as mentioned above.\nIn addition, not only in a surface acoustic wave device with a ladder-type circuit structure having a parallel arm resonator, but also in a surface acoustic wave device with another structure, deterioration in characteristics may be produced by changes in the inductance component due to a"} +{"output_text": " silicon oxide film. A floating diffusion layer (FD) 307 is provided at the bottom surface of the voltage conversion section 304.\nA transfer gate electrode 308 of a reset transistor 311 is provided on the p-type semiconductor layer 301, which is between the photodiode 302 and the voltage conversion section 304, via a gate insulating film 309 made from a silicon oxide film. A floating diffusion layer (FD) 309 is provided at the bottom surface of the voltage conversion section 304.\nA transfer gate", "input_text": " the side surfaces and the bottom surface of the p-type surface diffusion layer 208. 
In this manner, by forming the device separation section with the p-type surface diffusion layer 208, a stress upon a substrate by the device separation insulating film 207 that is formed by STI is reduced, and thus leak current can be suppressed.\nFurther, Reference 3 proposes, for example, a structure in which a thick oxide film is formed on a silicon substrate, and an impurity diffusion layer is provided below the oxide film for device separation to prevent the generation of dark current without employing LOCOS or STI for device separation. This will be described with reference to Portion (a) of FIG. 11 and Portion (b) of FIG. 11.\nPortion (a) of FIG. 11 is a top view showing an exemplary structure of a pixel section 300 in a conventional solid-state image capturing device disclosed in Reference 3. Portion (b) of FIG. 11 is a longitudinal cross-sectional view of a portion cut by line F-F\u2032 in Portion (a) of FIG. 11.\nAs shown in Portion (a) of FIG. 11 and Portion (b) of FIG. 11, in the pixel section 300 in the conventional solid-state image capturing device, an n-type photodiode 302 having an n-type impurity implanted therein is formed at the top surface of a p-type semiconductor layer 301 and a p-type surface diffusion layer 303 is formed at the top surface of the photodiode 302 to form an embedded photodiode structure, as in the cases shown in FIG. 9 and FIG. 10.\nA transfer gate electrode 306 of a charge transfer transistor 310 is provided on the p-type semiconductor layer 301, which is between the photodiode 302 and a voltage conversion section 304, via a gate insulating film 305 made from a"} +{"output_text": "1000V) and lower than the maximum voltage (1800V).\nIn the above application method, the circuit voltage (1000V) is set to be the maximum voltage (1800V) and the maximum current (200 A) is set to be the maximum current (200 A) which is allowed to flow in the circuit. 
Thus, the circuit voltage (1000V) is set to be the maximum voltage (1800V) and the maximum current (200 A) is set to be the maximum", "input_text": " to clamp the surge voltage. This operation occurs at each turn-OFF time. Since the energy loss corresponds to an excessively charged amount of charges, the clamp type snubber circuit has an advantage that the energy loss is small in comparison with a completely discharging type snubber circuit.\nHowever, the surge voltage suppressing function of a snubber circuit using a capacitor, i.e., either of the completely discharging type snubber circuit and the clamp type snubber circuit, has a disadvantage that the magnitude of the generated surge voltage varies, depending on the magnitude of an interruption current. For example, a surge voltage of 400V is generated when the interruption current is 100 A as shown in FIG. 36 and a surge voltage of 800V is generated when the interruption current is 200 A as shown in FIG. 37. Thus, the surge voltage varies according to the magnitude of the interruption current.\nIn this case, 100 A corresponds to 100% of a normally used current area of the switching element and 200 A is an excessive current set value such as an accident current and corresponds to 200% of the normal current. In the above application method, a circuit a voltage (1000V)+surge voltage (800V)+marginal amount (200V)=2000V is used as the breakdown voltage of the element. That is, the breakdown voltage of the element becomes approximately 200% of the circuit voltage.\nThe above relation can be expressed by use of a reverse bias safe operating area (RBSOA) which is necessary for the characteristic of the element, as shown in FIG. 38. That is, it is necessary to safely interrupt the maximum excessive current at the circuit voltage (1000V) and safely interrupt the steady-state current at the maximum voltage (1800V). As is clearly seen from FIG. 
38, the safe operation is required in an area higher than the circuit voltage ("} +{"output_text": " a patient.\nIn the case of a plethysmograph, a body part of a patient is clamped between two clamps, which are connected to a plethysmograph. The body part is then inflated with a gas, such as air, and the pressure in the body part is measured. The pressure is measured by means of a pressure sensor, which is connected to the clamps. The pressure sensor is connected to a computer, which is programmed to calculate the circumference of the body", "input_text": " of copper, is exposed to corrosion, the individual layers forming a galvanochemical element, which has a tendency to undergo undesired chemical reactions.\nThe necessary layers and method steps for the production of such a terminal or wiring device are generally sputtering on of an adhesive or carrier layer 11, sputtering on of a copper carrier layer (not represented), carrying out of a photolithographic process for the structuring of the sputtered-on metallizations 11, depositing of a copper interconnect layer 12, depositing of a nickel layer as a barrier or buffer layer 40, depositing of a gold layer 41 as protection and, finally, removal of the structured photomask and etching of the carrier layer in regions in which the structured photomask was previously provided.\nIn such a sequence of layers, the conductivity is determined by the deposited or plated copper layer 12. An improvement in the conductivity means increasing the depositing or plating time, which is associated directly with the process or production costs. To realize the same high conductivity as in the case of a BGA connection according to FIG. 4, which has an interposer 32 or base, the depositing or plating costs for a CSP/WLP terminal or wiring device as illustrated in FIG. 6 or FIG. 5 would not be economical. 
The present invention relates to an apparatus for measuring a variation in a circumference of a body part and method for plethysmography.\nPlethysmography is a procedure which has been known for some time now and which is used for determining macro- and microvascular parameters in the extremities, such as the venous capacity, the venous reflux, the venous elasticity, the venous outflow rate, the material blood flow and the capillary filtration rate. In general, plethysmography allows qualitative and quantitative statements to be made concerning the state and function of the macro- and microvascular circulation in an extremity of"} +{"output_text": ", the program voltage is applied to the control gate as a series of pulses. The magnitude of the pulses is increased with each successive pulse by a predetermined step size (e.g. 0.2 v). In the periods between the pulses, verify operations are carried out. That is, the programming level of each memory cell of a group of memory cells being programmed in parallel is read between successive programming pulses to determine whether it is equal to or greater than a verify level to which it is being programmed", "input_text": " of the device. A series of devices, which require a range of skill levels to operate, can be used in a designed grasping rehabilitation or grasping improvement program for patients, musicians and athletes alike. 1. Field of the Invention\nThe present invention relates generally to technology for memory devices and, more specifically, to detecting whether memory devices have been over programmed.\n2. Description of the Related Art\nSemiconductor memory devices have become more popular for use in various electronic devices. For example, non-volatile semiconductor memory is used in cellular telephones, digital cameras, personal digital assistants, mobile computing devices, non-mobile computing devices and other devices. 
Electrical Erasable Programmable Read Only Memory (EEPROM) and flash memory are among the most popular non-volatile semiconductor memories.\nTypical EEPROMs and flash memories utilize a memory cell with a floating gate that is provided above and insulated from a channel region in a semiconductor substrate. The floating gate is positioned between source and drain regions. A control gate is provided over and insulated from the floating gate. The threshold voltage of the memory is controlled by the amount of charge that is retained on the floating gate. That is, the minimum amount of voltage that must be applied to the control gate before the memory cell is turned on to permit conduction between its source and drain is controlled by the level of charge on the floating gate.\nSome EEPROM and flash memory devices have a floating gate that is used to store two ranges of charges and, therefore, the memory cell can be programmed/erased between two states. When programming an EEPROM or flash memory device, a program voltage is applied to the control gate and the bit line is grounded. Electrons from the p-well are injected into the floating gate. When electrons accumulate in the floating gate, the floating gate becomes negatively charged and the threshold voltage of the memory cell is raised.\nTypically"} +{"output_text": " particles. The hydrogen-containing gas is passed through the reaction zone at a temperature of about 1000.degree. F. to about 2000.degree. F. and at a pressure of about 100 to about 1000 psig. The hydrogen-containing gas is passed through the reaction zone at a rate of about 0.5 to about 10 volumes of gas per volume of coal per minute. 
The hydrogen-containing gas is passed through the reaction zone at a rate of about 0.5 to about 10 volumes", "input_text": " the operating costs of the process are exceptionally high because of the large hydrogen requirements.\nThe use of fluidized systems wherein a fluidized stream of finely divided coal particles and/or heated char particles is formed in a carrier stream to pyrolyze the coal particles, extracting the volatiles therefrom, is well known in the art. The heated char particles and/or the carrier gas stream are utilized to provide the requisite heat of pyrolysis to the coal particles. A supply of heated char is continuously produced upon pyrolysis of the coal in the system. Sulfur contaminants may be removed by the addition of sulfur acceptors such as iron oxides or lime to the particulate coal prior to processing or by heating the products to high temperatures in the presence of hydrogen upon removal of the products from the pyrolysis zone. Alternatively, desulfurization may be achieved during pyrolysis by enriching the carrier gas stream with hydrogen, which may be generated within the process by known gasification methods. Exemplary of such systems are: U.S. Pat. Nos. 3,007,849; 3,702,516; 3,736,233. Additional references relating to the pyrolysis method which are considered of some pertinency are found in Coal Processing Technology, Vol. 2, American Institute of Chemical Engineers, New York, N.Y. (1975), pp. 83-93, 119-120.\nAnother method employed to reduce the sulfur content of high-sulfur coal is the gasification of coal with steam and air or oxygen to produce fuel gas which must then be desulfurized prior to combustion. For example, U.S. Pat. No. 2,634,286, teaches hydrogenation of coal in the dry state by passing a stream of heated, hydrogen-containing gas upwardly through a reaction zone containing a mass of the substantially dry coal"} +{"output_text": " used as a feedstock for the production of sulfur-containing compounds. 
The hydrocarbon-containing fluid can be a petroleum-based fluid, such as a petroleum distillate, or a synthetic fluid, such as a Fischer-Tropsch product.\nThe sulfur-containing fluid can be a petroleum-based fluid, such as a petroleum distillate, or a synthetic fluid, such as a Fischer-Tropsch product. The sulfur-containing fluid can be a petroleum-based fluid,", "input_text": " in an activation zone maintained at a temperature which is more than about 300 and less than about 1,000xc2x0 F., thereby providing an activated sorbent.\nIn accordance with a further aspect of the present invention, there is provided a desulfurization process comprising, consisting essentially of, or consisting of the steps of: (a) contacting a sulfurized sorbent comprising a promoter metal and zinc sulfide with an oxygen-containing stream in a regeneration zone under regeneration conditions sufficient to convert at least a portion of the zinc sulfide to zinc oxide, thereby providing a desulfurized sorbent, the regeneration conditions including an average sulfur dioxide partial pressure of from about 0.1 to about 10 psig; (b) contacting at least a portion of the desulfurized sorbent with a hydrogen-containing stream in an activation zone under activation conditions sufficient to reduce the valence of the promoter metal, thereby providing an activated sorbent; and (c) contacting at least a portion of the activated sorbent with a sulfur-containing fluid comprising at least about 50 ppmw sulfur in a desulfurization zone under desulfurization conditions sufficient to provide a desulfurized fluid comprising less than about 50 weight percent of the amount of sulfur in the sulfur-containing fluid, wherein at least about 50 weight percent of the sulfur in the sulfur-containing fluid is present in the form of organosulfur compounds.\nIn accordance with one embodiment of the present invention, a novel process is provided for desulfurizing a sulfur-containing fluid by contacting the sulfur-containing 
fluid with a sorbent and thereafter regenerating and activating or re-activating the sorbent.\nThe sulfur-containing fluid employed in the process of the present invention is preferably a hydrocarbon-containing fluid comprising a quantity of sulfur compounds therein. Preferably, such hydrocarbon-containing fluid can be used as a fuel or can be"} +{"output_text": ".\nAccording to a preferred embodiment of the present invention, the error analysis information includes a plurality of error analysis data files, and the monitoring terminal apparatus instructs a server, defined by the resource definition data, to save the error analysis information, in response to reception of one error notifying message.\nAccording to a preferred embodiment of the present invention, the error analysis information includes a plurality of error analysis data files, and the monitoring terminal apparatus instructs a plurality of servers, defined by the resource", "input_text": " a software identifier and at least one set of resource definition data defining resources related to saving operation of error analysis information, and wherein when the monitoring terminal apparatus receives an error notifying message including the software identifier from any one of the servers, the monitoring terminal apparatus instructs a server, defined by a data record corresponding to the software identifier as one of the resources, to save the error analysis information.\nMore specifically, each data record in the management table includes at least one set of resource definition data defining a server to perform saving operation on error analysis information, a data file including the error analysis information and an output file where the error analysis information is saved, and the monitoring terminal apparatus designates the data file and the output file defined by the resource definition data, and instructs the server to save the error analysis information.\nAccording to a preferred embodiment of the 
present invention, at least one of the data records stored in the management table includes plural sets of resource definition data corresponding to one index code, and the monitoring terminal apparatus instructs a plurality of servers, defined by the plural sets of resource definition data, to save the error analysis information, in response to reception of one error notifying message.\nFurther, according to the preferred embodiment of the present invention, the index code of each data record stored in the management table includes an additional code accompanying the software identifier, indicative of error type, and the error notifying message transmitted from the server has a message identifier including the software identifier and the additional code indicative of the type of an error detected in the server, further, when the monitoring terminal apparatus receives an error notifying message from any one of the servers, the monitoring terminal apparatus searches the management table based on the message identifier of the received message, and instructs saving of the error analysis information if it is determined that a specific type of error has occurred in a specific software program designated in advance in the management table"} +{"output_text": " of these infections are caused by bacteria that are resistant to antibiotics.\nThe use of antibiotics in the food industry has been banned in the United States since the early 1970s. However, the use of antibiotics in animal feed has been allowed since the early 1980s. The use of antibiotics in animal feed has been allowed because of the belief that the antibiotics are needed to prevent disease in the animals. The use of antibiotics in animal feed has been allowed because of the belief that the antibiotics are needed to prevent", "input_text": "18,084 and 3,437,082 disclose a variety of spring, ball, and sleeve configurations. 
There is no means in any these patents by which the spring members are prevented from cylinderizing. More importantly however, the flow paths or channels in each of the above listed patents can be severely diminished and restricted.\nIn summation, there is a definite need for a flow fitting to withstand high localized pressures, to accept very heavy sealants and lubricants in order to prolong equipment life, and to provide substantially unrestricted flow channels in which the injected sealants, lubricants, or the like, could travel without plugging-off the fitting. The present invention relates to an antibacterial aqueous solution comprising a phosphate, a citrate, and a silicate. The present invention is also related to a method of controlling bacterial contamination and/or growth in a food substance, a method of prohibiting the formation of, and/or facilitating the removal of, silicate aggregation on a metal substrate, and a method of, for environmental protection purposes, reducing phosphate usage in industrial antibacterial processes.\nBacteria live everywhere in our environment, air, soil, rock, and water. Many bacteria are pathogenic and can cause diseases such as Botulism food poisoning, E-coli food poisoning, Cholera, Whooping Cough, Plague, Scarlet fever, Diphtheria, Tuberculosis, Typhoid fever, Anthrax, and so on and so forth. The extent of food borne infections in the United States was quantitatively documented in the CAST report of 1994 (Foodborne Pathogens: Risks and Consequences. Task Force Report No. 122, Council for agricultural Science and Technology, Washington D.C.), and has been extensively characterized in the past few years (CDC. 1988c. 1997 Final FoodNet Surveillance report. U.S. Department of Health and Human Services, October, 1998). Many"} +{"output_text": " have knowledge of the surrounding terminals. This means that each terminal is required to have knowledge of the existence of the other terminals. 
This is referred to as the hidden node problem.\nIn a Mesh topology, the existence of a terminal is not known to the other terminals. This means that the terminal cannot be aware of the existence of the other terminals. This means that the terminal cannot be aware of the existence of the other terminals. This means that the terminal cannot be aware of the existence of the", "input_text": "Personal Area Network: 10-20 m range) standards such as ZigBee adopt Adhoc/Mesh topology where there is no central coordinating entity to control the traffic flow, and where the scheduling/routing of data transmission is managed in a distributed manner. The Mesh characteristics allow data to be conveyed beyond the PAN range by multihopping the information via a series of neighbouring devices. Because each transmission link is kept short, the power consumption per terminal is kept low.\nHowever, in order to reliably convey information from a source to a destination, this topology encounters several problems.\nRouting/Scheduling Complexity\nOne characteristic of Mesh topology is that there could be multiple routes from source to destination. FIG. 1 schematically illustrates a Mesh network topology. Seven terminal devices A to G are shown. In order to transfer information from terminal A to terminal G, route 1 (solid arrows) or route 2 (dashed arrows) can be taken. The arrows accompanied by question marks indicate alternative routes available for transmission from a given terminal. In particular, terminal A needs to make a decision whether to transmit its data first to terminal B (via solid arrow), or to terminal C (via dashed arrow). Terminal D needs to make a similar decision. 
This means that each terminal in a Mesh topology requires knowledge of the existing surrounding terminals and requires the capability to select the optimum route to a given destination, which intuitively requires significant intelligence.\nFurthermore, in an energy conserving system, some terminals may be required to switch into a hibernation mode when they are not required to receive or transmit. In such a scenario, because there is no central coordinator, each terminal is required to have knowledge of when the neighbouring terminals are capable to receive information, which would impact on how they schedule transmission.\nHidden Node Problem\nAs described above, in a Mesh topology each terminal is required to"} +{"output_text": " top end of the probe, wherein the cantilever is vibrated in a lengthwise direction of the probe.\nAlso, to achieve the above objects, according to the present invention, there is provided a near-field optical microscope comprising: an illumination part for illuminating a sample surface with light; a cantilever having a probe and a probe hold part, with a top end part of the probe positioned near the sample; and an objective optical system for receiving scattered light generated at a top end", "input_text": " has an object of providing a near-field optical microscope which can detect scattered light over a wider angle range by arranging the structure of a cantilever, and the cantilever for the near-field optical microscope.\nTo achieve the above objects, according to the present invention, there is provided a near-field optical microscope comprising: an illumination part for illuminating a sample surface with light; a probe provided at a position near the sample surface illuminated with the light; a light detection part for detecting light scattered by the probe; and a scanning part for scanning the sample and a top end of the probe relatively to each other, wherein the top end of the probe is a top end of an extending part extending in 
one direction from a body of the probe, in a side of the top end of the extending part, the extending part is at most three times or less as thick as a tope end diameter, over a length of a wavelength of the illuminating light, and the near-field optical microscope further comprises means for vibrating the probe in a lengthwise direction of the extending part.\nAlso, to achieve the above objects, according to the present invention, there is provided a probe used for a near-field optical microscope, comprising: a probe body; and an extending part extending in one direction from the probe body, wherein in a side of a top end of the extending part, the extending part is at most three times or less as thick as a top end diameter, over a length of 700 nm from the top end.\nAlso, to achieve the above objects, according to the present invention, there is provided a near-field optical microscope comprising: an illumination part for illuminating a sample surface with light; a cantilever having a probe and a probe hold part, with a top end part of the probe positioned near the sample; and an objective optical system for receiving scattered light generated at a"} +{"output_text": " to the slice determined by the reconstructing range determining unit.\nAccording to another aspect of the present invention, there is provided, as shown in FIG. 
4, an X-ray computerized tomography apparatus, comprising: an X-ray detection unit 23 for detecting transmission X-rays from a plurality of directions Irradiated from an X-ray beam generation source 21 and transmitted through a subject; a data acquisition unit 27 for acquiring transmission data according to the transmission X-rays detected by the X", "input_text": " as an insertion object inside the subject exists.\nFurther, it is still another object of the present invention to provide an X-ray computerized tomography apparatus capable of improving the operation efficiency of the photographing of a target organ inside a subject.\nIn order to achieve the above objects, a first feature of the present invention resides in directly detecting a position of an object inside a subject from transmission data acquired (i.e., projection data). Based on the information of the detected position, it is possible to determine a range in which an image should be reconstructed, a range in which an image should be displayed (visualized), or a range in which a subject should be scanned, and to carry out a prompt processing in a necessary range.\nFurther, a second feature of the present invention resides in displaying arbitrary data among acquired transmission data, together with a display image of a reconstructed image. By displaying this transmission data, It is possible to easily understand in real time the progress state of an insertion object three-dimensionally.\nAccording to one aspect of the present invention, there is provided, as shown in FIG. 
3, an X-ray computerized tomography apparatus, comprising: an X-ray detection unit 23 for detecting transmission X-rays from a plurality of directions Irradiated from an X-ray beam generation source 21 and transmitted through a subject; a data acquisition unit 27 for acquiring transmission data according to the transmission X-rays detected by the X-ray detection unit; an object position detection unit 31 for detecting a position of an object inside the subject, according to a part of the transmission data acquired by the data acquisition unit; a reconstructing range determining unit 46 for determining a slice to be image-reconstructed, according to the position detected by the object position detection unit; and an image reconstruction unit 45 for reconstructing a tomographic image of a slice in which the object exists, according"} +{"output_text": " the electron projection lithography device is used.\nAccording to the invention, the electron projection lithography device is used at layers such as an isolation layer, a gate level, a contact hole layer, and a wiring layer just after the gate level, where pattern formation is difficult by the photolithography device. At other layers to be sufficiently processed even by the photolithography, the electron projection lithography device is used.\nAccording to the invention, the electron projection lithography device is used at", "input_text": " transcribed patterns by 50 times or more, and a thickness of the stencil mask has become thin to 5 xcexcm or lower. Therefore, another object of the present invention is to provide a method of setting a beam interval, which prevents bending in a stencil mask.\nA micro-beam provided for the purpose of preventing bending or the like can be made sufficiently thin to make projection of the patterns difficult. However, this may cause a problem such as narrowing, where the transcribed patterns become large or small in size locally at the micro-beam portion. 
Therefore, another object of the present invention is to suppress pattern deformation at a micro-beam portion by providing a forming place, a shape and a material of an optimal micro-beam, and a projection method.\nAs described above, throughput and resolution greatly varied depending on projection devices and methods, and required throughput and resolution were never satisfied simultaneously. Thus, regarding the two types of devices, i.e., photolithography having high throughput, and electron projection lithography having throughput low compared with that of the photolithography but still relatively high, and a high resolution capability, the present invention presents a projection device and a projection method capable of obtaining highest throughput while satisfying required accuracy and required resolution for each type and layer. The invention also presents a method of manufacturing a semiconductor device, which makes effective selection of two types of projection methods, i.e., non-complementary and complementary reticles, so as to obtain highest throughput while satisfying required accuracy and required resolution, when the electron projection lithography device is selected.\nAccording to the invention, the electron projection lithography device is used at layers such as an isolation layer, a gate level, a contact hole layer, and a wiring layer just after the gate level, where pattern formation is difficult by the photolithography device. At other layers to be sufficiently processed even by the photolithography,"} +{"output_text": " to solve the above problem. The Mobile IP is a scheme in which the mobile terminal is assigned with a home address (HoA) which is a fixed address in the home network and a care-of address (CoA) which is a fixed address in the visited network. 
The mobile terminal can make an access to the Internet by using the home address and the care-of address.\nThe Mobile IP is a scheme in which the mobile terminal is assigned with a home address (HoA) which", "input_text": "P table in which IP addresses and MAC addresses of terminals moving over plural subnets are set in correspondence. When a packet destined to another subnet is entered from one subnet, this packet is directly sent to the destination terminal by looking up the ARP table. In this scheme, however, only the communications between subnets which are directly connected to the switch node are possible and the transfer processing for a subnet which is not directly connected cannot be done because there is no routing processing, so that there is a need to use the switch node and the usual router simultaneously.\nThus, currently there are intensive research and development activities on a \u201chigh speed router device\u201d for realizing the fast IP packet transfer by resolving the bottleneck of the network layer processing in the router device. On the other hand, there are also research and development activities for a technique to accommodate mobile terminals in Internet type network. Such a mobile access technique includes a scheme using DHCP (Dynamical Host Configuration Protocol) server and a scheme using Mobile IP.\nThe scheme using DHCP server is a scheme in which the mobile terminal makes an Internet access by temporarily obtaining an IP address from the DHCP server within the network. 
The problem associated with this scheme using DHCP server is that the strategy to utilize the IP address dynamically obtained from the network of the visited site works well in the case where the mobile terminal makes an access to a server in an internal network, that is, the case where the mobile terminal is a call originating side, but it does not work well in applications where the mobile terminal can be a call terminating side such as Internet telephone and electronic conference system. Namely, in such applications, it is difficult for the other machines to ascertain the IP address currently used by the mobile terminal so that it is practically impossible to make an access to the mobile terminal from the other machines.\nThe Mobile IP is a scheme developed in order"} +{"output_text": ")oxy]-pyrimidine; 4-(xcex1,xcex1,xcex1,4-tetrafluoro-N-ethyl-m-toluidino)-6-[(xcex1,xcex1,xcex1,4-tetrafluoro-m-tolyl)oxy]-pyrimidine; 4-(xcex1,xcex1,xcex1,4-tetrafluoro-N-propyl-m-toluidino)-6-[(x", "input_text": " preferred for use in the method of invention are fenpyroximate, acequinocyl, diafenthiuron, fenazaquin, pyridaben and pyrimidifen.\nMany pyrimidine compounds, including the formula I pyrimidine compounds, methods for their preparation and the insecticidal and acaricidal uses thereof are described in U.S. Pat. No. 5,707,995 and WO 98/12184. 
Among the broad class of pyrimidine compounds described, surprisingly, it has now been found that those particular pyrimidine compounds of formula I are useful for the protection of beneficial insects from infestation and damage caused by parasitic mites.\nPreferred pyrimidine compounds of formula I useful in the method of the invention are those compounds wherein X1 and X2 are each O or NR;\nR1, R3, R8 and R10 are each independently hydrogen or trifluoromethyl, with the proviso that at least one of R1, R3, R8, and R10 must be trifluoromethyl;\nR2 and R9 are each independently hydrogen, chlorine or fluorine; and\nR4, R5, R6, and R7 are hydrogen.\nMore preferred formula I pyrimidine compounds useful in the inventive method are 4-[(4-chloro-xcex1,xcex1,xcex1-trifluoro-m-tolyl)oxy]-6-[(xcex1,xcex1,xcex1, 4-tetrafluoro-m-tolyl)oxy]-pyrimidine; 4-(xcex1,xcex1, xcex1,4-tetrafluoro-N-methyl-m-toluidino)-6-[(xcex1,xcex1,xcex1, 4-tetrafluoro-m-tolyl"} +{"output_text": " the high-permittivity film 408c.\nThe high-permittivity film 408c is formed by a chemical vapor deposition (CVD) method. The high-permittivity film 408c is formed by a chemical vapor deposition method using a material such as tantalum pentoxide (Ta.sub.2 O.sub.5) or barium strontium titanate (Ba.sub.x Sr.sub.y TiO.sub.3, BST", "input_text": " present invention relates to a memory cell suitable for applications to a highly integrated semiconductor memory and, more particularly, to a capacitor constituting a memory cell and a method of manufacturing the capacitor.\n2. Description of the Prior Art\nA memory cell (to be referred to as an 1T cell hereinafter) constituted by one transistor and one capacitor is known as a highly integrated semiconductor memory cell. 
The 1T cell is very popular because it requires a small number of constituent elements and facilitates a reduction in memory cell area.\nAn output voltage from a 1T cell is proportional to the capacitance value of a capacitor (to be referred to as a cell capacitor hereinafter) constituting a memory cell. For this reason, to assure the stable operation in a highly integrated arrangement, the capacitance value of the cell capacitor must be sufficiently large. To highly integrate 1T cells, cell capacitors each having a sufficiently large capacitance value in a small area are required.\nA capacitor using a high-permittivity film, as described in IEDM Technical Digest 1991, pp. 823-826, is known as a typical conventional cell capacitor. This conventional cell capacitor is shown in FIG. 1.\nAs shown in FIG. 1, the cell capacitor has a silicon substrate 401 having a major surface. A silicon oxide film 402 is formed on the major surface of the silicon substrate 401. A plurality of contact holes are formed in the silicon oxide film 402. Impurity-doped polysilicon members 403 are buried in the plurality of contact holes, respectively. The silicon substrate 401 is electrically connected to a plurality of storage electrodes 406c each consisting of a tantalum film 404c and a platinum film 405c. A high-permittivity film 408c used as a capacitance film is formed on the entire surface including the plurality of storage electrodes 406c and the silicon oxide film 402. A counter electrode 409c is stacked on"} +{"output_text": " the Offshore Technology Conference\u201d, paper presented at the 1998 Offshore Technology Conference held in Houston Tex. from 4 to 7 of May 1998, pp. 699-712. (g) Lundberg et al.; \u201cSpin-off Technologies from the Offshore Technology Conference\u201d, paper presented at the 1998 Offshore Technology Conference held in Houston Tex. from 4 to 7 of May 1998, pp. 699-712. (h) Lundberg et al.; \u201cSpin-off Technologies from", "input_text": "148,866; U.S. Pat. 
No. 6,286,558; U.S. Pat. No. 6,004,639; U.S. Pat. No. 6,361,299\nOther relevant foreign patent documents related to fabricating composite umbilicals include the following, entire copies of which are incorporated herein by reference:\nDE 421-4383; EP 0024512; EP 352148; EP 505815; GB 553,110; GB 2255994; GB 2270099\nOther relevant publications related to fabricating composite umbilicals include the following, entire copies of which are incorporated herein by reference: (a) Fowler Hampton et al.; \u201cAdvanced Composite Tubing Usable\u201d, The American Oil & Gas Reporter, pp. 76-81 (September 1997). (b) Fowler Hampton et al.; \u201cDevelopment Update and Applications of an Advanced Composite Spoolable Tubing\u201d, Offshore Technology Conference held in Houston Tex. from 4 to 7 of May 1998, pp. 157-162. (c) Hahan H. Thomas and Williams G. Jerry; \u201cCompression Failure Mechanisms in Unidirectional Composites\u201d, NASA Technical Memorandum pp 1-42 (August 1984). (d) Hansen et al.; \u201cQualification and Verification of Spoolable High Pressure Composite Service Lines for the Asgard Field Development Project\u201d, paper presented at the 1997 Offshore Technology Conference held in Houston Tex. from 5 to 8 of May 1997, pp. 45-54. (e) Haug et al.; \u201cDynamic Umbilical with Composite Tube (DUCT)\u201d, Paper presented at the 1998 Offshore Technology Conference held in Houston Tex. from 4 to 7 of May, 1998, pp. 699-712. 
(f) Lundberg et al.; \u201cSpin-off Technologies from"} +{"output_text": "-sheet tray even if the upper main body portion is opened.\nThe present invention provides an image-forming apparatus comprising:\na main body having a first side wall and a second side wall, the first side wall having a first opening and a second opening, the second side wall having a third opening and a fourth opening, the first opening and the third opening being aligned with each other;\na first manual feeder tray having a first side edge and a second side edge, the first side edge", "input_text": " main body and the upper main body. The folded manual feeder tray is energized inward to the side wall by a spring or the like not to leave the side walls by its own gravity when it is not used. The manual feeder tray, when it is used, is opened by pulling out the second side edge around the first side edge as the turning axis.\nThe sheet discharge unit has usually a discharged-sheet tray for holding image-carrying discharged sheets. This discharged-sheet tray can holds various size of recording sheets, from a small size sheet to a large size sheet. The large sheet size herein means an A3 size or a B4 size, and the small size means a calling card size or a postcard size. A discharged-sheet tray is known which is folded up into the main body wall when the image is formed on a recording sheet of a small size or the image forming-apparatus is not used.\nMany image-forming apparatuses have a flat side face having a depression formed on a part of the side walls of the upper and lower main bodies for fitting the manual feeder tray. The depth of the depression is approximate to the thickness of the manual feeder tray for compactness of the image-forming apparatus.\nHowever, with such an insufficient depth of the depression, a part of the folded manual feeder tray may come to be inclined into the main body when the upper main body portion is opened. 
In this state, if the upper main body portion is brought down to close, the part of the manual feeder tray can be broken by collision against the upper main body portion owing to dimensional variations of the manual feeder tray or variation in the assemblage thereof. Such a damage may occur with the discharged-sheet tray.\nUnder the aforementioned circumstances, the object of the present invention is to provide an image-forming apparatus which does not cause damage of a manual feeder tray or a discharged"} +{"output_text": " up cause the filament to break down.\nIn an effort to overcome the limitations of incandescent bulbs, fluorescent bulbs were developed. Fluorescent bulbs are typically formed of a glass tube containing a small amount of mercury and a phosphor coating on the inside of the glass tube. When an electric current is passed through the tube, the mercury vaporizes and emits ultraviolet light. The phosphor coating on the inside of the glass tube absorbs the ultraviolet light and re-emits visible light. The visible", "input_text": " activated manually. Architects are commonly employed to assist not only with a floor plan of physical spaces, but also with the proper selection and layout of lighting to best complement the floor plan and usage of each space within a building. As may be appreciated, illumination of a space is determined at the time of production of blueprints, in anticipation of construction. The illumination that has been chosen for a space is essentially fixed during building construction. Changes may be made later, but not without substantial additional expense that will, for exemplary purposes, often include removal of parts of or entire walls, with the accompanying disruption of the space. Often the space is unavailable for use during the entire duration of a remodeling project.\nFurther complicating the issue of illumination is the type of light bulb that may be most appropriate for a space or location. 
Original electric light bulbs were incandescent. With sufficient electrical energy, which is converted to heat within an incandescent bulb filament, the filament will emit visible light. This is similar to a fire, where with enough heat, visible light is produced. As might also be appreciated though, incandescent bulbs produce far more heat than light. The color of the light from these bulbs is also most commonly quite yellow, casting a warm hue at a color temperature typically in the vicinity of 3,000 degrees Kelvin. Warm hues are often prized in relaxed settings such as those of a living room or dining room, more closely resembling gentle candle light. However, in contrast thereto, work and study environments are more preferably illuminated with light of more blue content, more closely resembling daylight with color temperatures of approximately 6,000 degrees Kelvin. Daylight color temperatures are not practically obtained using an incandescent bulb. In addition, these incandescent bulbs have only a few thousand hour life expectancy, even with more than a century of improvements, because the extreme temperatures required for the filament to light"} +{"output_text": " the formation is measured by injecting a current into the formation and measuring the voltage drop across the formation. The current is injected into the formation by means of a current source and the voltage drop across the formation is measured by means of a current measuring device. The current source and current measuring device are typically located at the surface of the earth. The current source may be a current generator such as a current generator that is driven by a current source power supply. 
The current source power supply may be a battery or", "input_text": " \u201cpropagation resistivity\u201d or \u201cwave resistivity\u201d tools, and they operate at frequencies high enough that the measurement is sensitive to the dielectric constant under conditions of either high resistivity or a large dielectric constant. See for example U.S. Pat. Nos. 4,899,112 and 4,968,940. In MWD applications, resistivity measurements may be used for the purpose of evaluating the position of the borehole with respect to boundaries of the reservoir such as with respect to a nearby shale bed. The same resistivity tools used for LWD may also used for MWD; but, in LWD, other formation evaluation measurements including density and porosity are typically employed.\nFor purposes of this disclosure, the terms \u201cresistivity\u201d and \u201cconductivity\u201d will be used interchangeably with the understanding that they are inverses of each other and the measurement of either can be converted into the other by means of simple mathematical calculations. The terms \u201cdepth,\u201d \u201cpoint(s) along the borehole,\u201d and \u201cdistance along the borehole axis\u201d will also be used interchangeably. Since the borehole axis may be tilted with respect to the vertical, it is sometimes necessary to distinguish between the vertical depth and distance along the borehole axis. Should the vertical depth be referred to, it will be explicitly referred to as the \u201cvertical depth.\u201d\nTypically, the electrical conductivity of"} +{"output_text": " a continuing effort to increase the density of devices on integrated circuits. This is accomplished by reducing the size of the devices and by reducing the separation between devices. As the device size is reduced, the size of the active region on the surface of the semiconductor wafer is also reduced. The active region is the region of the semiconductor wafer where the devices are fabricated. 
The active region is typically defined by a device isolation structure.\nThe device isolation structure is typically formed by a local oxidation of silicon (LOCOS", "input_text": " Foreign Agent 10. Foreign Agent 10 then strips the encapsulation and forwards the message to Mobile Node 6 on sub-network 14. The packet forwarding mechanism implemented by the Home and Foreign Agents is often referred to as \u201ctunneling.\u201d\nIn addition to providing connectivity to a mobile node, it may be desirable to provide for the mobility of one or more networks moving together, such as on an airplane or a ship. RFC 2002 section 4.5 discusses the possibility of implementing mobile routers.\nIf Mobile Node 6 (or a mobile router) continues to move, it will receive advertisements from other foreign agents. In general, if Mobile Node 6 receives an advertisement from a new foreign agent, Mobile Node 6 will \u201ctear down\u201d its tunnel connection with Foreign Agent 10 and establish a new connection with the new foreign agent.\nHowever, the new foreign agent may not be the optimal foreign agent with which to establish a connection. For example, Mobile Node 6 may only receive advertisements from the new foreign agent for a brief period of time, as Mobile Node 6 travels in and out of the range of advertisements from the new foreign agent. This situation may be further complicated if Mobile Node 6 travels in and out of the range of advertisements from multiple foreign agents, or simply remains within overlapping ranges of multiple foreign agents. Under such circumstances, Mobile Node 6 would continue to tear down the established tunnel and create a new tunnel to the foreign agent from which Mobile Node 6 most recently received an advertisement. The process of tearing down and re-establishing a tunnel may take several seconds. 
Conventional mobile nodes and mobile routers do not have the ability to determine the optimal foreign agent with which to establish and maintain a connection under such circumstances. The present invention relates generally to lithography and more particularly relates to a system and method for measuring films associated with lithography processes or other type semiconductor fabrication processes.\nIn the semiconductor industry there is"} +{"output_text": " localized heating can cause the material to melt, and the material will then flow. This is the mechanism by which the pillar 2050 can be melted and then refilled with the amorphous phase-change material.\nThe material will then act as a low-resistance conductor, and the device will act as a resistor. The device will therefore act as a memory cell.\nThe material will also act as a heater, and the device will act as a heater. The device will therefore act as a programmable", "input_text": " the voltage drop will appear across the high-resistivity zone 2070 (if present). If sufficient voltage is applied, breakdown will occur across the high-resistivity zone. In this state the material will become very conductive, with large populations of mobile carriers. The material will therefore pass current, and current crowding can occur near the top of the pillar 2050. The voltage which initiates this conduction is referred to as the \u201csnapback\u201d voltage, and FIG. 2C shows why.\nFIG. 2C shows an example of instantaneous I-V curves for a device like that of FIG. 2A, in two different states. Three zones of operation are marked.\nIn the zone 2200 marked \u201cREAD,\u201d the device will act either as a resistor or as an open (perhaps with some leakage). A small applied voltage will result in a state-dependent difference in current, which can be detected.\nHowever, the curve with open circles, corresponding to the amorphous state of the device, shows some more complex behaviors. 
The two curves show behaviors under conditions of higher voltage and higher current.\nIf the voltage reaches the threshold voltage Vth, current increases dramatically without any increase in voltage. (This occurs when breakdown occurs, so the phase-change material suddenly has a large population of mobile carriers.) Further increases in applied voltage above Vth result in further increases in current; note that this upper branch of the curve with hollow circles shows a lower resistance than the curve with solid squares.\nIf the applied voltage is stepped up to reach the zone 2150, the behavior of the cell is now independent of its previous state.\nWhen relatively large currents are applied, localized heating will occur at the top of the pillar 2050, due to the relatively high current density. Current densities with typical dimensions can be in the range of tens of millions of Amperes per square cm. This"} +{"output_text": " magnetic device. This voltage is proportional to the strength of the magnetic field at the location of the coils. The voltage induced in the coils is used to determine the position of the magnetic device relative to the disk.\nThe magnetic device is typically flown over the surface of the disk at a constant velocity. The magnetic device is flown over the disk at a constant velocity because the disk is spinning at a constant velocity. The magnetic device is flown over the disk at a constant velocity because the disk is spinning at", "input_text": "N connection carrying all types of traffic. In order to stay connected to the LTE network, even in the case the single PDN Connection is moved to the WLAN, a solution could be to setup a \u201cdummy PDN connection\u201d to the LTE network. Several alternatives exist on when to setup the dummy PDN connection. 
This could, for example, be done when the UE first connects to LTE, where the dummy PDN connection is never released, or when the UE sets up the dummy PDN connection just before the ordinary PDN connection is handed over to the WLAN. The dummy PDN Connection can then be released when the ordinary PDN connection has returned to the LTE network.\nHaving a dummy connection is not preferred, for a number of reasons. First, the dummy connection generates control signaling upon initial setup and upon intra-LTE handover. Second, the dummy connection takes resources in the involved network nodes (e.g., memory state). Basically, rotating memory includes at least one disk capable of storing magnetic data. A magnetic device that includes a gap typically is flown over the surface of the magnetic disk. Current is passed through coils in the magnetic device to produce magnetic lines of flux at the gap of the magnetic device which in turn magnetizes portions of the disk surface. An actuator arm includes the magnetic device and is used to move the magnetic device to various positions over the surface of the disk.\nThe magnetic device is also used to sense the magnetized portions of the disk. This is commonly called reading the data from the disk. The actuator arm moves the magnetic device to a selected area of interest that contains data needed for a particular computation by a computer. The magnetized portion of the disk produces flux lines or a magnetic field near the surface of the disk. As the magnetic device is flown or passed near the surface of a spinning disk, a voltage is induced within the coils of the"} +{"output_text": " of success.\nThe present invention is directed to a method and apparatus for determining the location of a surgical instrument within a body cavity. 
More particularly, the invention is directed to a method and apparatus for determining the location of a surgical instrument within a body cavity, such as the abdominal cavity, using a plurality of sensors.\nIn the course of a surgical procedure, it is often necessary to determine the location of a surgical instrument within a body cavity. For example, in the course of a laparoscopic surgical", "input_text": "\nThe procedure carried out in the course of RIGS-based colorectal surgery involves, inter alia, a radionuclide survey of the lymph system and organs within the peritoneal cavity. Where a lymph node has been identified by the surgeon in the course of such survey by its association with a radiolabel in the course of surgery, it will be resected and immediately delivered to a tumor pathologist for intraoperative consultation. For this consultation, the pathologist typically carries out a somewhat standard technique which involves a sampling of the lymph node or tissue received, freezing, cutting of sections in a crystal, staining of those sections with hematoxylin-eosin or an equivalent stain, and examination under a microscope. Ideally, this procedure takes about five minutes per specimen, although extra time is allowed if multiple sections of specimens are to be examined. See in this regard, Cancer, Principles & Practice of Oncology, 4th Ed., vol. 1, p. 235, J.B. Lippincott Company, Philadelphia.\nBecause of the high sensitivity of the RIGS system, lymph node involvement may be identified at very early stages of colorectal cancer metastisis. This sensitivity may be occasioned by a form of biological amplification occurring wherein the radiolabeling system serves to identify sialomucin, a substance secreted by cancer involved cells, as opposed to the cells themselves. 
As a consequence, involved lymph nodes found positive by a radionuclide survey in the course of surgery which are delivered to the tumor pathologist may contain only a limited number of cancerous cells. Severely constrained by the time limitations of interoperative consultation, the pathologist often will not section a sample at the correct position and thus reports the resultant negative analysis to the surgeon. As is apparent, a technique is called for to aid the tumor pathologist in determining the proper location upon the specimen for carrying out sectioning with the highest probability"} +{"output_text": " lead dislodgement, and lead fracture. Lead dislodgement is a common problem that occurs when the lead tip becomes dislodged from the endocardial tissue. Lead dislodgement can occur for a variety of reasons including the formation of fibrous scar tissue over the lead tip, the formation of a blood clot about the lead tip, and the formation of a thrombus within the lead body.\nIn the past, the only way to remove a dislodged lead was to surgically remove the lead", "input_text": " pace the heart via electrical impulses and sense cardiac depolarizations.\nIn the past, various types of transvenous endocardial leads have been introduced into different chambers of the heart including the right ventricle, right atrial appendage and atrium as well as the coronary sinus. These leads usually are composed of an insulator sleeve that contains a coiled conductor having an electrode tip attached at the distal end. The electrode tip is held in place within the trabeculations of endocardial tissue. The distal ends of many available leads include passive fixation designs that may consist of flexible tines, wedges, or finger-like projections that extend radially outward and usually are molded from and integral with the insulator sleeve of the lead. These tines allow better containment by the trabeculations of endocardial tissue and help prevent dislodgement of the lead tip. 
Active fixation leads, on the other hand, are designed with lead tips that are lodged into the myocardium and may consist of helical coils, small sharp tips, and barbed tines among others.\nOnce an endocardial lead is implanted within a chamber of the heart, the body's reaction to its presence furthers its fixation within the heart. Specifically, shortly after implant, a blood clot forms about the lead tip due to enzymes released in response to the irritation of the endocardial tissue caused by the electrode tip which can be any one of a plurality of designs, tined, helical, flanged, among others. Over time, fibrous scar tissue eventually forms over the distal end. This scarring usually occurs within three to six months of implantation. In addition, fibrous scar tissue often forms, in part, over the lead's body or insulative sleeve within the vein through which the lead was passed.\nAlthough the state of the art in pacemaker and lead technology has advanced considerably, endocardial leads nevertheless occasionally fail, due to a variety of reasons, including insulation breaks,"} +{"output_text": "x80x98Mode2xe2x80x99 from xe2x80x98Mode1xe2x80x99, the direction of the TV set position is identified as a xe2x80x98Dxe2x80x99 direction, so that the microprocessor 15 outputs the earth magnetic field correction signals S3 and S4 of high voltage. 
Then, the transistors Q2, Q4, Q5 and Q7 of the earth magnetic field correction unit 17 are", "input_text": "\nMeanwhile, in case that the direction of the TV set position is identified as xe2x80x98Axe2x80x99 direction and xe2x80x98Modelxe2x80x99 is selected, since the microprocessor 15 outputs only the earth magnetic field correction signal S3 of high voltage, the transistors Q4, Q5 and Q7 of the earth magnetic field correction unit 17 are turned on, so that a little current flows in the xe2x80x98Axe2x80x99 direction (that is, B+xe2x86x92R7xe2x86x92Q5xe2x86x92L2xe2x86x92Q3xe2x86x92R5), thereby correcting the degaussed state.\nIf the earth magnetic field correction mode is changed to be set by xe2x80x98Mode2xe2x80x99 from xe2x80x98Mode1xe2x80x99, the direction of the TV set position is identified as a xe2x80x98Cxe2x80x99 direction, so that the microprocessor 15 outputs the earth magnetic field correction signals S3 and S4 of high voltage. Then, the transistors Q2, Q4, Q5 and Q7 of the earth magnetic field correction unit 17 are turned on, so that the current flowing through the resistance R5 also flows through the resistance R5 and the transistor Q2, thereby increasing the amount of the current (the flow of current: B+xe2x86x92R7xe2x86x92Q5xe2x86x92L2xe2x86x92Q4xe2x86x92R6xe2x86x92Q2).\nIf the earth magnetic field correction mode is set by xe2"} +{"output_text": " mental disorders is not required to use the Virtual Therapy method.\n2.m. The Virtual Therapy method is a novel combination of prior art. The prior art of virtual reality, identifying computer technology, graphic displays, hand-held-grip, and graphics (e.g. military applications, flight simulation, NASA COSTAR Mission to repair the hubble telescope) was not enough to suggest application of the individual or collective components for psychiatric treatments.\n2.n. The Virtual Therapy method demonstrates", "input_text": ", R., 1997. Virtual Therapy, supra). 
Virtual Therapy currently utilizes 3D immersion technology, including a head mounted display. As technological innovations advance with the concurrent building of learning principles into virtual environments (for therapeutic change), the delivery of this information through visual sensory input may take varied forms. For example, the visual display may be attached to a phone so that remote access to virtual environments may occur at home, in the office, or in public areas. Cellular technology, combined with a visual display, increases the opportunity to influences conscious processes at remote sites. Virtual Therapy may use video in two dimensions or video in three-dimension immersion using a head-mounted display.\n2.k. Prior-art references would not operate in combination. The prior-art of virtual reality, identifying computer technology, graphic displays, hand-held-grip, and graphics (e.g. military applications, flight simulation, NASA COSTAR Mission to repair the hubble telescope) was not enough to suggest application of the individual or collective components for psychiatric treatments.\n2.l. The Virtual Therapy method demonstrates that it is an inventive combination of prior art. These include but are not limited to computer technologies that produce graphics (SGI Machines, Division ProVision 100, Pixel Plane Technology), head-mounted displays (Virtual Research Flight Helmut, Division, Eyegen 3, Stereo Graphics Crystal Eyes), hand-held grips (Division Joystick and Logiteck 3D), and software support (Division, DVS) to produce stereo image generation, binaural audio synthesis, collision detection, and integration of a range of peripheral devices such as gloves and head-mounted display systems. Authoring software (Division, dVISE) can be used by non-programmers to import objects for the purpose of building and modifying virtual environments. 
In addition, knowledge of assessment and treatment of"} +{"output_text": " of its constituent elements selectively reduced whereby the superconductor as a whole has a superconducting carrier concentration such that it is held doped overly or doped optimally with superconducting carriers.\nThe present invention further provides a selective reduction type, high temperature superconductor that has on each of an upper and a lower surface of a unit lattice thereof a charge supply layer having each of its constituent elements selectively reduced whereby the superconductor as a whole has a superconducting carrier concentration such that it is held doped overly or doped optimally with", "input_text": " of the type described by doping with positive holes to raise the carrier concentration in a reduction process conditioned under low partial pressure or vacuum. Since it has been found impossible to increase the concentration of positive holes by reduction, i.e., by lowering the oxygen partial pressure, the conventional high temperature superconductors have the problem that they have a limited carrier concentration and are thus low and unsatisfactory in their superconducting properties that include the critical temperature Tc, critical current density Jc, irreversible magnetic filed Hirr. 
It has therefore been sought to, solve the problem of bringing into realization a high temperature superconductor of a reduced oxygen concentration.\nWith these problems taken into account, it is accordingly a first object of the present invention to provide a selective reduction type, high temperature superconductor that permits doping with positive holes by selectively reducing constituent elements (atoms).\nAnother object of the present invention is to provide a method of making a selective reduction type, high temperature superconductor.\nIn order to achieve the first object mentioned above, there is provided in accordance with the present invention, a selective reduction, high temperature superconductor, wherein it has a portion of its constituent elements selectively reduced whereby it has a superconducting layer thereof doped with positive holes.\nThe present invention also provides a selective reduction type, high temperature superconductor that has a portion of its constituent elements selectively reduced whereby there are formed in superconducting layers a first and a second region doped overly and doped optimally with superconducting carriers, respectively.\nThe present invention further provides a selective reduction type, high temperature superconductor that has a portion of its constituent elements selectively reduced whereby the superconductor as a whole has a superconducting carrier concentration such that it is held doped overly or doped optimally with superconducting carriers.\nThe present invention also provides a selective reduction type, high temperature superconductor that has on each of an upper and a lower surface of a unit lattice thereof a charge supply layer having each"} +{"output_text": " crystalline polypropylene and the bromobutyl rubber.\nU.S. Pat. No. 4,935,471 discloses a TPO composition having excellent mechanical strength, thermal stability, moldability, gas impermeability and damping characteristics. 
The TPO of the '471 application includes a crystalline polypropylene as a matrix and two elastomers: a bromobutyl rubber and an olefin copolymer rubber such as EPM or EPDM rubber. The composition also includes conventional additives such", "input_text": " the like.\nU.S. Pat. No. 4,480,074 discloses DVA compositions said to exhibit improved surface characteristics and fabricability wherein the compositions are prepared by blending an unvulcanized, but vulcanizable, monoolefin rubber with a blend containing cured polyolefin rubber with crystalline polyolefin and subsequently vulcanizing such that the final blend comprises about 15-45 parts by weight of crystalline polyolefin and 85-55 parts by weight of vulcanized rubber. EPDM is taught as both the vulcanized polyolefin rubber and the unvulcanized but vulcanizable rubber in the disclosed blends. Dynamic vulcanization utilizing peroxide cure systems, phenolic resin systems, phenylene-bismaleimide and diamine curatives, etc., is disclosed.\nJapanese patent application 85,530/87 discloses a TPO composition having excellent mechanical strength, thermal stability, moldability, gas impermeability and damping characteristics. The TPO of the '530 application includes a crystalline polypropylene as a matrix and two elastomers: a bromobutyl rubber and an olefin copolymer rubber such as EPM or EPDM rubber. The composition also includes conventional additives such as process oil. All of the components are combined and vulcanized in a single batch with a peroxide cure system but there is no indication of the inclusion of a peroxide co-agent such as m-phenylene bismaleimide (HVA-2) or the like. The '530 application's inventors found that while butyl and chlorobutyl rubbers are not cross-linkable with peroxide cures, bromobutyl rubbers are. 
Moreover, the '530 application's inventors explain that the enhanced physical properties claimed are due to the olefin copolymer rubber which provides flexibility to the TPO and also acts as a binder at the interface between the"} +{"output_text": " the group consisting of ethynyl, 1-propynyl, 2-propynyl, 1-butynyl, 2-butynyl, 3-butynyl, 1-pentynyl, 2-pentynyl, 3-pentynyl, 4-pentynyl, 1-hexynyl, 1-heptynyl, and 1-octynyl, which can be optionally substituted by 1, 2, or 3 substituents independently selected from the", "input_text": "3)3, \u2014N(CH3)2, \u2014N(C2H5)2, \u2014N(CH3)\u2014(C2H5), \u2014OCF3, and \u2014SCF3.\nIn another preferred embodiment, C2-10 alkenyl radicals are selected from the group consisting of vinyl, 1-propenyl, 2-propenyl, 1-butenyl, 2-butenyl, 3-butenyl, 2-methylpropen-1-yl, 3-methylbut-2-en-1-yl, (3,3)-dimethylbut-1-enyl, 2-methylbuten-2-yl, 1-pentenyl, 2-pentenyl, 3-pentenyl, 4-pentenyl, 1-hexenyl, 1-heptenyl, and 1-octenyl, which can be optionally substituted by 1, 2, or 3 substituents independently selected from the group consisting of F, Cl, Br, I, \u2014CN, \u2014NO2, \u2014OH, \u2014NH2, \u2014SH, \u2014O\u2014CH3, \u2014O\u2014C2H5, \u2014O\u2014CH(CH3)2, \u2014O\u2014C(CH3)3, \u2014S\u2014CH3, \u2014S\u2014C2H5, \u2014S\u2014CH(CH3)2, \u2014S\u2014C(CH3)3, \u2014NH\u2014CH3, \u2014NH\u2014C2H5, \u2014NH\u2014C(CH3)3, \u2014N(CH3)2, \u2014N(C2H5)2, \u2014N(CH3)\u2014(C2H5), \u2014OCF3, and \u2014SCF3.\nPreference is also given to C2-10 alkynyl radicals selected from"} +{"output_text": ": \nwherein the symbol * means an asymmetric carbon atom, which is obtained by asymmetrically hydrogenating an easily available 4-substituted oxy-3-oxobutyrate represented by formula III: \nwherein the symbol * means an asymmetric carbon atom, which is represented by formula IV: \nwherein the symbol * means an asymmetric carbon atom, which is represented by formula V: \nwherein the symbol * means an asymmetric carbon atom, which is represented by 
formula", "input_text": "Further, 3-hydroxy-xcex3-butyrolactone is water-soluble, and in any processes (1) to (3), washing with water is necessary at the stage of post-treatment after the reaction is finished, thus making the procedure troublesome and often lowering the yield, and therefore these cannot be said to be efficient processes.\nAccordingly, it cannot be said from an economical viewpoint and in respect of efficiency that the prior art processes are industrially suitable production processes, and there is demand for development of an industrially suitable process for producing optically active 3-hydroxy-xcex3-butyrolactone.\nThe object of this invention is to provide a novel process for producing optically active 3-hydroxy-xcex3-butyrolactone in a short step, which is superior economically and in efficiency and industrially suitable by using a starting material which is inexpensive and easily available and reagents easy to handle.\nUnder these circumstances, the present inventors made an extensive study for solving the object described above. As a result, they found that an optically active 4-substituted oxy-3-hydroxybutyrate obtained by asymmetrically hydrogenating an easily available 4-substituted oxy-3-oxobutyrate is hydrogenated in the presence of a heterogeneous hydrogenation catalyst and an acidic substance followed by deprotection and simultaneous ring closure thereof, whereby optically active 3-hydroxy-xcex3-butyrolactone of high optical purity can be obtained in high yield, and this invention was thereby completed.\nThat is, this invention relates to a process for producing optically active 3-hydroxy-xcex3-butyrolactone represented by formula I: \nwherein the symbol * means an asymmetric carbon atom, which comprises hydrogenating an optically active 4-substituted oxy-3-hydroxybutyrate represented by formula II"} +{"output_text": " the gyroscope is equipped with a pair of counter-rotating disks that are connected by a thin rod. 
The disks are made of a material with a half-integer spin, and the rod is made of a material with a half-integer spin. The gyroscope is designed to be rotated at a constant angular velocity, and the rod is designed to be rotated at a constant angular velocity. The gyroscope is designed to be rotated at a constant angular velocity, and the rod is", "input_text": " literature. But none of the examples that follow are accepted as reproducible examples of gravitomagnetic induction; nor is there any prior art on a device to produce gravitomagnetic induction utilizing a head disk assembly.\nGyroscopes produce a force when twisted that operates \u201cout of plane\u201d and can appear to lift themselves against gravity. Although this force is well understood to be illusory, even under Newtonian models, it has nevertheless generated numerous claims of gravitomagnetic induction devices and any number of patented devices. Perhaps the best known example is a series of patents issued to Henry William Wallace, an engineer at GE Aerospace in Valley Forge, Pa., and GE Re-Entry Systems in Philadelphia. He constructed devices that rapidly spun disks of brass, a material made up largely of elements with a total half-integer nuclear spin. (A \u201ckinemassic field\u201d generator from U.S. Pat. No. 3,626,605: \u201cMethod and apparatus for generating a secondary gravitational force field\u201d.) He claimed that by rapidly rotating a disk of such material, the nuclear spin became aligned, and as a result created a \u201cgravitomagnetic\u201d field in a fashion similar to the magnetic field created by the Barnett effect.\nHayasaka and Takeuchi had reported weight decreases along the axis of a right spinning gyroscope. Tests of their claims by Nitschke and Wilmath yielded null results. A few years later, recommendations were made to conduct further tests. 
Provatidis and Tsiriggakis have proposed a novel gyroscope equipped by couples of rotating mass particles that draw only the upper (or lower) 180 degrees of a circle, thus producing net impulse per full revolution. This is achieved by transforming the previously used circular orbit into a figure-eight-shaped path (symbol of infinity) of variable curvature that entirely lies on the surface of a hemisphere. Moreover,"} +{"output_text": "ft yarns are known. The weft yarns are guided in a transport chain and are transferred to a weft layer. The weft layer is guided over a guide element. The guide element has a guide surface for the weft yarns. The guide surface is formed by a plurality of guide elements. The guide elements are arranged in a line next to one another parallel to the transport chain. The transport chain has two rows of hooks arranged spaced at a distance from one another. The guide", "input_text": " bands is guided diagonally over the transport chains with the aid of a weft layer or a diagonal layer. The guide elements for the fiber bands are aligned perpendicular to the movement direction of the weft layer and arranged in a line next to one another parallel to the transport chains. The transport chains have two rows of hooks arranged spaced at a distance from one another. The guide hooks are located adjacent to the fiber arrangement. They have perpendicular needles closely adjacent to one another with the tip pointing upwards. Outside this row of guide hooks there is another row with retainer needles pointing upwards and outwards. These are likewise arranged very densely.\nThe guide elements on the weft layer or diagonal layer are vertically fixed. 
In the direction change phase a so-called fold tensioner is inserted behind the guide element of the weft layer, which fold tensioner guides the upper and lower strands of the direction change fold separately from one another at the apex of the same until both strands are transferred to the row of guide hooks again after racking of the upper strand is completed by a racking grid swung in from above. While racking is executed, the fibers of the direction change fold are stretched and collected by a so-called loop tensioner and transferred to the row of retainer hooks in the form of a rope.\nDue to the large number of tools involved in the operation, this procedure requires a very high control expenditure. The desired effect, namely to achieve a really gap-free form of the fiber arrangement, is achieved only with reservations. The working speed remains limited and is unsatisfactory. With a change in the width of the fiber bands or with the change of alignment of the fiber band sheet between the transport chains, the work elements always have to be structurally adapted to the new conditions. 
The associated expense is high.\nWith DE 197 42 721 C1 a method and a device for laying and positioning we"} +{"output_text": "panol, and methoxybutanol.\nExamples of suitable regulators include butyl glycidyl ether, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol, allyl glycidyl ether, allyl alcohol", "input_text": " from 0.5 to 5% by weight, of component (a3),\n(a4) up to 50% by weight, preferably from 15 to 45% by weight, of component (a4),\n(a5) up to 50% by weight, preferably from 15 to 35% by weight, of component (a5),\n(a6) from 0.1 to 20% by weight, preferably from 1 to 15% by weight, of component (a6), and\n(a7) up to 30% by weight, preferably up to 25% by weight, of component (a7),\nthe sum of the weight fractions of components (a1) to (a7) being in each case 100% by weight.\nThe polyacrylate resins (A) used in accordance with the invention are prepared in an organic solvent or solvent mixture and in the presence of at least one polymerization initiator, and in the presence or absence of a regulator. Organic solvents, polymerization initiators, and regulators used are the solvents, regulators, and polymerization initiators that are customary for the preparation of polyacrylate resins. 
The solvents may participate in the reaction with the crosslinking component (B) and so act as reactive diluents.\nExamples of suitable solvents include butyl glycol, 2-methoxypropanol, n-butanol, methoxybutanol, n-propanol, ethylene glycol monomethyl ether, ethylene glycol monoethyl ether, ethylene glycol monobutyl ether, diethylene glycol monomethyl ether, diethylene glycol monoethyl ether, diethylene glycol diethyl ether, diethylene glycol monobutyl ether, trimethylolpropane, ethyl 2-hydroxypropionate, and 3-methyl-3-hydroxybutanol, and also propylene glycol-based derivatives, e.g., ethyl ethoxypropionate, isopropoxypro"} +{"output_text": " alkoxy, halogen, hydroxy, amino, nitro, cyano, carboxy, carboalkoxy, carboalkyl, alkylthio, alkylsulfinyl, alkylsulfonyl, alkylsulfonylamino, alkylamino, alkylamido, alkylamidoalkyl, alkylcarbonylamino, alkylcarbonyl, alkylcarbonyloxy, alkylcarbonylaminoalkyl, alkylcarbonyl, alkylcarbonyloxyalkyl, alkylcarbonylaminoalkyl, alkylcarbonyl, alkylcarbonyloxyalkyl, alkylcarbonylaminoalkyl, alkylcarbonyl,", "input_text": " wherein R.sup.14 is independently as defined above; PA1 (a) --(CH.sub.2).sub.x OR.sup.7 wherein R.sup.7 is H, C.sub.1 -C.sub.4 alkyl, C.sub.1 -C.sub.4 acyl, C.sub.3 -C.sub.6 cycloalkyl, phenyl, or benzyl, PA1 (b) --(CH.sub.2).sub.x NR.sup.7 R.sup.8 wherein R.sup.7 is independently as defined above and R.sup.8 is H, C.sub.1 -C4 alkyl, phenyl, benzyl, or C.sub.1 -C.sub.4 acyl PA1 (c) --(CH.sub.2).sub.x OCH.sub.2 R.sup.7 wherein R.sup.7 and x are as defined above, PA1 (d) --CHO PA1 (e) --CN, PA1 (f) --COOR.sup.9 wherein R.sup.9 is hydrogen, C.sub.1 -C.sub.4 alkyl or benzyl; PA1 (a) R.sup.14 -(CH.sub.2).sub.x - wherein x and R.sup.14 are, independently, as defined above, PA1 (b) R.sup.14 R.sup.13 CH(CH.sub.2).sub.y - wherein y is zero, one, two, three, four or five, R.sup.14 is as defined above, and R.sup.13 is lower alkyl, cycloalkyl, naphthyl, phenyl unsubstituted or substituted with from one through five 
substituents, preferably from one through three substituents, comprising alkyl,"} +{"output_text": " example, a typical digital system may operate at a 5V logic signal level, while a high voltage signal may be in the range of 12V to 20V. In order to interface a high voltage signal to a digital system, a level shifting circuit is typically used. A level shifting circuit is a circuit that converts a high voltage signal to a lower voltage signal that is compatible with a digital system. For example, a level shifting circuit may be used to interface a high voltage signal to a digital system", "input_text": " the normalized capacitance (C) of the gated diode pair 28 is nearly 1.8 whereas the normalized capacitance (C) of the STI diode pair 30 is approximately 1.0. This equates to the gated diode pair 28 having an approximately eighty percent (80%) increase in capacitance over the STI diode pair 30 in this example.\nIncreased perimeter capacitance in a gated diode increases the load capacitance when the gated diode is added to a protected circuit. Increasing load capacitance can negatively affect protected circuits. For example, increased load capacitance can decrease switching times and frequency performance of a protected circuit, because charging time will be increased due to the ESD protection circuit being coupled to the protected circuit in an R-C circuit arrangement. Further, increased capacitance provided as a result of inserting an ESD protection circuit can decrease the sensitivity of radio frequency (RF) components, such as a low noise amplifier (LNA). However, use of an STI diode having a lower capacitance in an ESD protection circuit also has a trade off over a gated diode. 
Use of an STI diode in an ESD protection circuit can result in low CDM voltage tolerances for the protected circuit for both positive and negative surges, and especially for protected circuits and related processes employing thin oxide gate oxide dielectric devices coupled to a pad that can be found in large SOC chips.\nTo preserve performance, chip manufacturers and customers have had to accept the lower CDM voltage tolerances provided by use of STI diodes in ESD protection circuits, which results in greater ESD-related exposure and failures. Thus, a need exists to provide an ESD protection circuit that exhibits superior conductance and turn-on time as well as a low capacitance so as to not adversely affect performance of a protected circuit. Many electronic systems utilize high voltage signals that exceed the 5V or less logic signal levels of typical digital systems. For"} +{"output_text": ".1 -K.sub.4, respectively. The output of multiplier 14.sub.1 is supplied to delay circuit 9, while the output of multiplier 14.sub.2 is supplied to delay circuit 10, and the output of multiplier 14.sub.3 is supplied to delay circuit 11, and the output of multiplier 14.sub.4 is supplied to delay circuit 12.\nThe output of delay circuit 9 is supplied to multiplier 14.sub.1, while the output of delay", "input_text": " conventional tone generating device suitable for simulating the sound generating mechanism of woodwind instruments is shown in the block diagram of FIG. 9, wherein an excitation signal generating circuit 1b can be seen similar to the excitation signal generating circuit 1a described above. In the case of the device shown in FIG. 9, the output signal of the excitation signal generating circuit 1b is supplied to a loop circuit via a subtracter 7 and an adder 8, both of which are components of the loop circuit. As FIG. 
9 shows, immediately between subtracter 7 and adder 8, a nonlinear element 6 is included as a component of the loop circuit which simulates the nonlinear characteristics of a reed which is the sound generating element in the woodwind instrument under simulation. Subtracter 7 and adder 8 on either side of nonlinear element 6 simulate the application of air pressure to the reed in the instrument being simulated.\nDelay circuits 9 through 12 can be seen, each consisting of, for example, multiple stage shift registers. These delay circuits 9 through 12 simulate the delay of transmission of air pressure waves in tubular portions of the simulated wind instrument. Delay circuits 9 and 10 correspond to tubular portions of the instrument tubes nearest to the reed, while delay circuits 11 and 12 correspond to those farthest from the reed. Delay circuit 9 receives the output signal from adder 8, whereas delay circuit 10 supplies an input signal to subtracter 7 wherein the output of delay circuit 10 is subtracted from the output signal from excitation signal generating circuit 1b.\nA junction circuit 13 is incorporated into the loop circuit which simulates the scattering of air pressure waves caused by variations in the diameter of tubular portions of the woodwind instrument being simulated. In this junction circuit 13, a fourth order multiplier lattice is used which consists of multipliers 14.sub.1 -14.sub.4 having multiplication coefficients K.sub"} +{"output_text": " below.\nFIG. 4 is a graph showing a potential distribution in the space in the neck of the CRT. In FIG. 4, reference numeral 1 denotes a potential on the first grid G1; 2, a potential on the second grid G2; 3, a potential on the third grid G3; 4, a potential on the fourth grid G4; 5, a potential on the fifth grid G5; 6, a potential on the shield cup 12; 7, a potential", "input_text": "61 and 55-38484 and U.S. Pat. Nos. 3,932,786 and 4,413,298. 
However, there is no adequate space for the resistor unit to be arranged within the CRT. For this reason, the resistor unit is located in a small space in the neck 6 such that it is situated near the electron gun assembly 7.\nFIG. 2 is one form of an electron gun assembly having a resistor unit arranged in it. In an arrangement shown in FIG. 2, reference numeral 7 denotes electron gun assembly 10a, 10b, 10c (10b, 10c hidden from view in FIG. 2), heaters; 11a, 11b, 11c (11b, 11c hidden from view in FIG. 2), cathodes; G1, G2, G3, G4 and G5, first, second, third, fourth and fifth grids, respectively; 12, a shield cup; 13a, 13b, a pair of insulating support rods; 15, a spacer; 16, an inner conductive film and 17, a stem pin.\nIn the electron gun assembly 7, a resistor unit 14 is located at the back surface of the insulating support rod 13a.\nThe resistor unit 14 is formed as shown in FIG. 3. In the arrangement shown in FIG. 3, 18 denotes an insulating board; 19, a high resistance section; T1... T4, voltage pickup terminals; and CN, a connector.\nIf the resistor unit 14 is arranged in a narrow space in the neck 6 such that it is located near the electron gun assembly 7, a relatively complex potential distribution is created in the space in the neck of the CRT, which is caused by a potential on each electrode in the electron gun assembly 7 and on the inner conductive film 16. For this reason, a problem occurs as set out"} +{"output_text": " Biochemistry and Molecular Biology (IUBMB), the phosphoryl transferases are classified into five families: protein kinases, lipid kinases, nucleoside kinases, nucleotide kinases, and transferases. Protein kinases are further classified into two groups: serine/threonine kinases and tyrosine kinases. The serine/threonine kinases are further classified into two groups: protein kinase A (PKA) and protein kinase G (PKG). 
The tyrosine kinases are further classified into two groups: protein kinase C (", "input_text": " that these optical receivers must have high sensitivity to 1.3 \u03bcm wavelength light but substantially no sensitivity to 1.55 \u03bcm wavelength light.\nHowever, to achieve such a high selectivity ratio, conventional APDs must be provided with a wavelength filter, as described below.\nReferring to FIG. 12, the bandgap wavelength of the InGaAs light absorption layer 177 is 1.67 \u03bcm, and that of the InP window layer 174 is 0.92 \u03bcm. Therefore, this APD has high sensitivity to a wide range of wavelengths, from 0.92 \u03bcm to 1.67 \u03bcm, which means that the APD has approximately the same sensitivity to 1.3 \u03bcm and 1.55 \u03bcm wavelengths. As a result, the APD cannot receive the shorter wavelength 1.3 \u03bcm without receiving the longer wavelength 1.55 \u03bcm unless it is provided with a wavelength filter.\nFurther, as described above, although photodetector devices for selectively receiving the longer wavelength light have been available, there is no known photodetector device capable of selectively receiving the shorter wavelength light. The invention relates to inhibitors of enzymes that catalyze phosphoryl transfer and/or that bind ATP/GTP nucleotides, compositions comprising the inhibitors, and methods of using the inhibitors and inhibitor compositions. The inhibitors and compositions comprising them are useful for treating or modulating disease in which phosphoryl transferases, including kinases, may be involved, symptoms of such disease, or the effect of other physiological events mediated by phosphoryl transferases, including kinases. 
The invention also provides for methods of making the inhibitor compounds and methods for treating diseases in which one or more phosphoryl transferase, including kinase, activities is involved.\nPhosphoryl transferases are a large family of-enzymes that transfer phosphorous-containing groups from one substrate to another. By the conventions set forth by the Nomenclature Committee of the International Union of"} +{"output_text": ", such as a hydrogel.\nYet another embodiment of the invention concerns a cooling device that includes a cooling chamber that is defined by a base sheet and a cover sheet. The cover sheet is attached to the base sheet by a plurality of connecting membranes. The connecting membranes are attached to the base sheet and the cover sheet by a plurality of attachment members. The attachment members are located in a plurality of attachment holes defined in the base sheet and the cover sheet. The attachment members are preferably made of a flexible", "input_text": ", each of which spans and interconnects the cooling chambers. Air enters the chambers through an air inlet in the inflatable structure, and then exits the chambers toward the person through numerous exhaust holes located in the base sheet. After being heated by the person\"\"s body, this air can exit the inflatable structure through evaporation openings in the connecting membranes.\nIn a different embodiment, upper and base sheets may be adhered in different locations to provide a cooling device with a cooling chamber that forms a continuous a serpentine path. The serpentine cooling chamber is held in this configuration by connecting membranes between neighboring segments of the path. In this embodiment, ventilating cross-members are unnecessary because all regions of the cooling chamber are in fluid communication with each other. 
After air enters the chamber through an air inlet in the structure, the air exits the chamber toward the person through numerous exhaust holes located in the base sheet. After being heated by the person\"\"s body, this air can exit the inflatable structure through evaporation openings defined in the connecting membranes.\nAnother embodiment of the invention concerns a generally rectangular cooling device that includes certain body-conforming features. Namely, the device has a number of body-contour slits extending inward from the perimeter. Due to the slits\"\" locations, they permit the inflatable structure to conform to a person\"\"s legs and outstretched arms.\nStill another embodiment of the invention concerns an inflatable cooling device that includes an evaporative cooling layer. This layer comprises a sheet of absorbent material capable of holding a substantial amount of water. The absorbent sheet is placed in thermal contact with the person\"\"s skin, saturated with water, and then evaporatively cooled by the overlying thermal cooling device. As this layer cools (by evaporation), it has the effect of cooling the person (by conduction). The absorbent sheet may advantageously comprise a super-absorbent material"} +{"output_text": " the hair follicle, leading to a temporary alopecia.\nAlopecia areata is a non-scarring, non-infectious, non-inflammatory, non-neoplastic, non-immunologic, non-endocrinologic, non-autoimmune, non-toxic, non-metabolic, non-neoplastic, non-inflammatory, non-infectious, non-immunologic, non-endocrinologic, non-autoimmune, non", "input_text": " immunologic alopecia characterized by the abrupt onset of sharply defined areas of hair loss. In the most severe cases, the scalp will develop total hair loss (alopecia totalis) or the hair loss will involve the whole body surface (alopecia universalis). Most of the patients will run an unpredictable and relapsing course with multiple episodes of hair loss and regrowth. 
Only about 20 to 30 percent will have a single reversible episode. Regrowth of hair is common within several months, but in many instances is not complete, and relapses are common. Alopecia areata may be associated with autoimmune diseases such as vitiligo, pernicious anemia, collagen disease, and endocrinopathies.\nTraumatic alopecia is induced by physical trauma, of which the two most important groups, from the therapeutic standpoint are trichotillomania and alopecia resulting from cosmetic procedures or improper hair care. Trichotillomania is a compulsive habit in which the individual repeatedly pulls or breaks off his or her own hair in a partially conscious state similar to thumb sucking or nail biting. Traumatic alopecia from cosmetic procedures is done consciously in ill-advised individuals and is almost exclusively seen among females. Sometimes this type of alopecia is associated with folliculitis induced by the occlusive effect of the oily cosmetics used in the procedure.\nAnagen effluvium is a temporary alopecia caused by the inhibition of mitosis in the hair papilla by certain cytotoxic drugs, leading to constriction of the hair shaft or to complete failure of hair formation. In particular, alopecia frequently occurs in cancer patients who are treated with chemotherapeutic drugs such as cyclophosphamide (CY) and/or irradiation. U.S. Pat. No. 5,962,523 Such agents damage"} +{"output_text": " processing code including computer code for routing the packet to an appropriate output interface queue; and computer code for storing the packet in an intermediate data structure to await full processing if the packet is determined not to be delay-sensitive.\nA fourth specific embodiment of the present invention provides a method for routing traffic in a packet-switched, integrated services network which supports a plurality of different service classes. The network includes at least one router having at least one input interface and at least one output interface. 
The method", "input_text": " level of the packet is at least priority P, the packet is fully processed, including routing the packet to an appropriate output interface queue. If, however, the associated priority level of the packet is less than priority P, the packet is stored in an intermediate data structure to await full processing.\nA second specific embodiment of the present invention provides a method for routing traffic in a packet-switched, integrated services network which supports a plurality of different service classes. The network includes at least one router having at least one input interface and at least one output interface. The method comprises preprocessing at least one packet from the input interface to determine if the packet is delay-sensitive. The preprocessing includes classifying the packet to determine an associated priority level of the packet. If the packet is determined to be delay-sensitive, it is immediately and fully processed, which includes routing the packet to an appropriate output interface queue. If the packet is determined not to be delay-sensitive, the packet is stored in an intermediate data structure to await full processing. The intermediate data structure is used for queuing packets which have been preprocessed, but which have not yet been processed sufficiently to be routed to an appropriate output interface queue.\nA third specific embodiment of the present invention provides a computer program product for routing traffic in a packet-switched integrated services network which supports a plurality of different service classes. The network includes at least one router. The router includes at least one input interface having at least one line input and at least one output interface. The computer program product comprises at least one computer useable medium having computer code embodied therein. 
The computer readable code comprises computer code for processing at least one packet from the input interface to determine if the packet is delay-sensitive, wherein the preprocessing code includes computer code for classifying the packet; computer code for fully processing the packet if the packet is determined to be delay-sensitive, the fully"} +{"output_text": " inks, it is possible to obtain images with a broad range of color reproducibility and a suppression of graininess.\nHowever, when recording is performed using an ink set comprising light and dark inks, the light inks are applied to the recording medium in a larger amount than the dark inks, and the dark inks are applied to the recording medium in a smaller amount than the light inks. Therefore, the dark inks are more likely to be absorbed by the recording medium than the", "input_text": " such a device, including the steering abilities, etc., the cost of manufacturing a disposable device would be extremely large, thus reducing the likelihood of its commercial acceptance. Ink jet recording methods are printing methods in which recording is performed by causing the jetting of small droplets of ink (ink composition), and causing these droplets to adhere to a recording medium such as paper or the like. Such methods are advantageous in that clear images with a high resolution can be printed at a high speed using a relatively simple apparatus. Ink sets used in such ink jet recording methods include ink sets comprising respective cyan (C), magenta (M) and yellow (Y) inks, ink sets in which a black (K) ink is added to these inks, and the like. For example, an ink set combining cyan, magenta and yellow inks which makes it possible to obtain good images, especially images with a good hue, in addition to possessing light resistance and water resistance, has been disclosed (Japanese Patent Application Laid-Open No. 
10-120956).\nIn recent years, ink sets having light and dark inks which differ from each other in color density while being of the same color have been developed in order to realize both a broader range of color reproducibility and a suppression of conspicuous graininess when images are expressed by dots (a state in which the dots appear to be grainy when observed with the naked eye). For example, there are ink sets which have the four inks of C, M, Y and K as dark inks, and the four inks of light cyan (Lc), light magenta (Lm), light yellow (Ly) and light black (Lk) as light inks.\nBy performing recording while varying the amount of coloring material applied per unit area of the recording medium (i.e., varying the duty) using such an ink set comprising light and dark"} +{"output_text": "diameter segmented rotor assembly, the resultant rotor assembly will not be sufficiently rigid to prevent relative movement between the individual laminations.\nAccordingly, it is an object of the present invention to provide a segmented rotor assembly for a dynamoelectric machine that is capable of being fabricated at a relatively low cost and that is sufficiently rigid to prevent relative movement between the individual laminations.\nIt is another object of the present invention to provide a segmented rotor assembly for a dynamoelectric machine that is capable of being fabricated at", "input_text": " the non-vertical axis of the machine.\nIn fact, it has been found that in addition to the resultant so-called bicycle chain effect between the laminated rim and the spider on which it is mounted, further undesirable relative movement occurs in such machines between the individual rotor rim laminations. Specifically, this second type of movement involves sliding or skewing of the segmented rim laminations relative to one another due to the typically loose tolerances allowed in prior art rotor keying arrangements. 
Because of such loose tolerances and the types of key structures employed heretofore, each rim lamination segment is not held in direct contact with a key to prevent it from sliding or skewing relative to other rim laminations.\nA typical prior art procedure used to avoid skewing between adjacent rim laminations of segmented rotor assemblies of a kind normally fabricated on-site, requires the performance of relatively expensive machining operations by which the irregular sidewalls of keyways in the rim laminations are smoothed to within close tolerances of the width of associated keys. Accordingly, when the keys are positioned in the keyways they closely abut essentially all of the laminations and prevent relative movement between them. Of course, such machining operations make it necessary to provide a large vertically reciprocable planning tool at the often-remote sites where such relatively large diameter, segmented rim assemblies are normally fabricated, thus creating an undesirably high manufacturing cost that should preferably be avoided if possible.\nIt is also known in the prior art to manufacture relatively small-diameter dynamoelectric generators by heat-shrinking rotor laminations directly onto a shaft to secure them and prevent the above-mentioned looping or bicycle chain effect caused by centrifugal and gravitational forces brought to bear on the laminations when the rotors are turned at high speeds. 
As stated above though, it has been found that if the same type of heat shrinking fabrication methods are applied to make a large-"} +{"output_text": " of the invention is to provide novel collagens modified by grafting thiol functions, that are easy to use and to handle in the medical field.\nAnother essential objective of the invention is to provide novel collagens modified by grafting thiol functions, that are easy to use and to handle in the cosmetic field.\nAnother essential objective of the invention is to provide novel collagens modified by grafting thiol functions, that are easy to use and to handle in the pharmaceutical field.\nAnother", "input_text": " rheological characteristics and/or its biological characteristics, are moreover known. Thus, patent application PCT WO 90/05755 describes a collagen in which the amines of the lysine residues it comprises are substituted with a synthetic hydrophilic polymer chain and more particularly with monomethyl polyethylene glycol. This collagen-PEG is presented as having low immunogenicity and improved mechanical properties of elasticity and malleability.\nPatent application PCT WO 94/01483 discloses a biologically inert, biocompatible conjugated polymer material, formed by a natural polymer such as collagen, linked via an ether bond to a synthetic hydrophilic polymer such as polyethylene glycol (PEG).\nThe modified collagens according to the prior art do not afford all the desired satisfaction, as regards their mechanical properties, their in vivo degradation kinetics and their biological characteristics. 
Moreover, the known collagens modified with free or substituted thiol functions still have scope for improvement, as regards controlling, by means of the degree of crosslinking, their mechanical and biological characteristics.\nFinally, it would be advantageous for the crosslinkable forms of the known modified collagens to have solubility properties over a wide pH range, so as to make them easier to use, without this having a negative effect on their level of crosslinking.\nIn this prior art, one of the essential objectives of the invention is to provide novel collagens modified by grafting free or substituted thiol functions, these novel collagens needing to be capable of crosslinking in a sufficient and controlled manner, by forming intercatenary disulfide bridges.\nAnother essential objective of the invention is to provide novel collagens modified by grafting thiol functions and characterized by high degrees of grafting coexisting with good solubility over a wide pH range.\nAnother essential objective of the invention is to provide novel collagens modified by grafting thiol functions, that are easy to use and to handle industrially.\nAnother essential objective"} +{"output_text": " key can encrypt a message for the user. The user can decrypt the message with its private key. Anyone knowing the user's public key can send a message to the user. The user can decrypt the message with its private key. Anyone knowing the user's public key can send a message to the user. 
The user can encrypt the message with its private key.\nThe term \u201cmessage\u201d is defined as any information, including data, voice, video, text, etc.\n", "input_text": "Existing PCB and other interface technology has already failed to keep pace with current semiconductor and computer technology, and as computer and microprocessor speeds continue to climb, with space efficiency and routeability becoming increasingly important, multi-layer substrates having more efficient interconnect characteristics will be required. The PCBs discussed above fall short of current and contemplated semiconductor-related and computer-related requirements. Public key cryptosystems use a pair of asymmetric related keys, one for encryption and the other for decryption. Encryption in this context does not necessarily imply that the result is confidential, since data encrypted with a private key can be decrypted by anyone holding the public key\u2014which may be widely available. One of the key pair, the private key, is kept secret by the user, while the other key, the public key, can be publicly disclosed. The key pair must have the property that, given knowledge of the public key, it is infeasible to determine the private key.\nA user receives or, with suitable hardware or software, can generate for itself a pair of keys which are generally two large numbers. The user keeps one of these keys private and never discloses it. The other can be safely made public, just like a phone number or similar personal data. Due to the way the keys are generated, information encrypted with the private key can only be decrypted with the public key and vice versa. Using a key pair means that the sender and receiver do not need to share a secret key.\nPublic keys do not have to be published to the world. 
They can be shared as widely or narrowly as business and privacy requirements dictate.\nThe term \u201cuser\u201d is defined as any entity including individuals, groups of individuals, one or more individuals in a role, corporations, organisations, computer applications or systems, automated machines, etc.\nPublic key cryptography makes the following possible: Anyone knowing the user's public"} +{"output_text": " keyboard input.\nThe IME module of the operating system is a program that is loaded into memory when the operating system is started. The IME module is responsible for translating the user's keystrokes into the appropriate character for display on the screen. The IME module is also responsible for translating the user's keystrokes into the appropriate character for display on the screen. The IME module is also responsible for translating the user's keystrokes into the appropriate character for display on the screen", "input_text": " kiosk environments. A specific example of the IBM Consumer Device Services (CDS) product is the IBM Consumer Device Services for OS/2.RTM. which is a licensed program for operation on all Intel.RTM. architecture personal computer systems that support OS/2 and available from IBM. Further discussion and more details of the CDS product are contained in the subsequent description provided in this application with respect to a preferred embodiment of the subject invention.\nFor multi-byte character language inputs from a physical keyboard into a computer system, the operating system performs the handling of multiple keystrokes per character on behalf of an application program running on the computer system. The module of the operating system which performs this task is referred to as an input method editor (IME). Input method editors are also referred to as front end processors as the editor immediately manipulates the entered information to display the desired text on the screen. 
The IME module, or applet, of the operating system, allows the user to enter the thousands of different characters used in Far Eastern written languages such as Chinese, Japanese and Korean, using a standard 101-key keyboard. IMEs can be used when text is entered that doesn't involve typing each character directly and are widely used in operating systems for entering ideographs and other characters phonetically, or component by component, into computer systems. The user composes each character in one of several ways, including by radical, that is, a group of strokes in a character that are treated as a unit for the purpose of sorting, indexing and classification, by phonetic representation or by typing in the numeric codepage index of the characters, which is a standard index for characters of all national languages, promulgated by the International Standard Organization. IMEs are widely available and Windows.RTM. and OS/2 operating systems include an IME module with the operating system that handles physical"} +{"output_text": " and local area network connections along with high-resolution displays and associated image processing chips. Such devices can provide capabilities such as full internet connectivity, entertainment including full-resolution video, navigation, electronic banking and more, all in a pocket-size device. Complex portable devices require packing numerous chips into a small space. Moreover, some of the chips have many input and output connections, commonly referred to as \u201cI/Os.\u201d These I/Os must be interconnected with the I/Os of other chips.", "input_text": " catheter. The use of the sheath in conjunction with the pull cord may allow the cooling member to be easily manipulated and steered by moving the pull cord.\nThe cooling member may be comprised of an outer member disposed over an inner member. 
Coolant may be transferred to the inner member in order to cool the cooling member to a temperature appropriate for causing cold-induced necrosis, which may be appropriate for a particular medical procedure. Preferably, coolant may be sprayed onto the inner member in order to facilitate heat transfer between the cooling member and an area of interest.\nThe cooling member may further comprise an electrode and an electric lead. Alternatively, the cooling member may further comprise a pad printed conductive electrode having an electrical lead. According to this embodiment, the electrode may be used to determine the electrical activity of tissue at an area of interest.\nMultiple alternative embodiments of the cooling member are also disclosed. For example, the cooling member may further comprise a support member that may help to prevent the cooling member from rupturing. The cooling member may include a cryo balloon therapy chamber disposed on the mesh cage, the cryo balloon therapy chamber connected to a coolant source. The cryo balloon therapy chamber may further comprise a cryo balloon therapy ring. In another preferred embodiment, the cooling member may further comprise a heat exchange surface that may be slidable, a slidable and rotatable sprayer, or a cryo balloon therapy assembly. The demand for more compact physical arrangements of microelectronic elements such as integrated chips and dies has become even more intense with the rapid progress of portable electronic devices, the expansion of the Internet of Things, nano-scale integration, subwavelength optical integration, and more. 
Merely by way of example, devices commonly referred to as \u201csmart phones\u201d integrate the functions of a cellular telephone with powerful data processors, memory and ancillary devices such as global positioning system receivers, electronic cameras,"} +{"output_text": " the nucleic acids can be inserted into a vector by synthesizing the vector and inserting the DNAs into the synthesized vector. The synthesized vector is then used to transform a suitable host cell.\nThe nucleic acids of the invention can be used to produce the polypeptides of the invention by standard techniques. For example, the nucleic acids can be used to synthesize an oligonucleotide probe or primer which is then used to amplify the nucleic acid encoding the polypeptide.\nThe polypeptides of the invention can be used to produce", "input_text": " provides a library of P. aeruginosa-derived nucleic acid sequences. The libraries provide probes, primers, and markers which can be used as markers in epidemiological studies. The present invention also provides a library of P. aeruginosa-derived nucleic acid sequences which comprise or encode targets for therapeutic drugs.\nNucleic acids comprising any of the sequences disclosed herein or sub-sequences thereof can be prepared by standard methods using the nucleic acid sequence information provided in SEQ ID NO: 1-SEQ ID NO:16571. For example, DNA can be chemically synthesized using, e.g., the phosphoramidite solid support method of Matteucci et al., 1981, J. Am. Chem. Soc. 103:3185, the method of Yoo et al., 1989, J. Biol. Chem. 764:17078, or other well known methods. This can be done by sequentially linking a series of oligonucleotide cassettes comprising pairs of synthetic oligonucleotides, as described below.\nOf course, due to the degeneracy of the genetic code, many different nucleotide sequences can encode polypeptides having the amino acid sequences defined by SEQ ID NO: 16572-SEQ ID NO: 33142 or sub-sequences thereof. 
The codons can be selected for optimal expression in prokaryotic or eukaryotic systems. Such degenerate variants are also encompassed by this invention.\nInsertion of nucleic acids (typically DNAs) encoding the polypeptides of the invention into a vector is easily accomplished when the termini of both the DNAs and the vector comprise compatible restriction sites. If this cannot be done, it may be necessary to modify the termini of the DNAs and/or vector by digesting back single-stranded DNA overhangs generated by restriction endonuclease cleavage to produce blunt ends, or to achieve the same result by filling in the single-stranded termini with an appropriate DNA polymerase.\nAlternatively,"} +{"output_text": ", and more compact engine to be designed.\nThe linear gear drive system is also advantageous in that it is a relatively simple gear relationship which can be used to convert reciprocating linear motion into a balanced rotary motion.\nThe linear gear drive system is also advantageous in that it is a relatively simple gear relationship which can be used to convert reciprocating linear motion into a balanced rotary motion.\nThe linear gear drive system is also advantageous in that it is a relatively simple gear relationship which can be used to", "input_text": " the thickeners most commonly used for formulating these compositions with benzoyl peroxide are acrylic acid polymers (Carbomer) and celluloses alone or combined with silicates.\nNow, the use of carbomers in compositions of aqueous gel type does not give good results in terms of chemical stability of the benzoyl peroxide or in terms of rheological stability. As described by Bollinger (Bollinger, Journal of Pharmaceutical Science, 1977, vol 5), it has been observed that from 5% to 20% benzoyl peroxide is lost after 2 months at 40\u00b0 C. depending on the neutralizer of the carbomer used. 
Furthermore, the release of benzoic acid results in depolymerization of the carbomers, leading to a drop in viscosity which may result in phase separation. In other gels consisting of a mixture of hydroxypropyl-cellulose and aluminum magnesium silicate, a drop in viscosity over time is also observed, resulting in sedimentation of the active agents as a suspension and heterogeneity of the dispersion in the finished product.\nThis instability of benzoyl peroxide gels impairs their efficacy and their cosmetic utility.\nThere is thus still a need for a physically stable gelled composition containing benzoyl peroxide and a retinoin. 1. Field of the Invention\nThis invention relates to two-stroke cycle engines, and more particularly, to a two-stroke cycle engine having a linear gear drive in which a piston rod unit moves linearly and which includes a sealed crankcase or gearcase.\n2. Description of the Prior Art\nA linear gear drive system is a relatively simple gear relationship which includes a fixed ring gear and a pinion gear rotating within the ring gear for converting reciprocating linear motion into a balanced rotary motion.\nWith linear motion of a piston rod unit, the typical piston wrist pin required for non-linear type reciprocating engines is eliminated. This allows a smaller, lighter"} +{"output_text": " the number of the capacitor cells is limited, the excess capacity of the capacitor cells is not used.\nFurther, in the case of the conventional power supply unit, the power supply unit is connected to the halogen heater through a connector. Therefore, the power supply unit is not capable of being detached from the halogen heater. For this reason, when the power supply unit is to be detached from the halogen heater, the power supply unit is detached from the connector, and the connector is detached from the halogen", "input_text": " This is because the life of the halogen heater becomes short if a large current is supplied to the halogen heater. 
Therefore, in order to supply high power to the halogen heater, the voltage needs to be raised.\nHowever, the mass capacitor has an inherent characteristic in that the voltage per one capacitor cell is as low as about several volts, a little more than 1 V in the case of a hydro-system, and a little less than 3 V in the case of an organic system. The low voltage is for preventing an electrolytic solution from forming inside the capacitor cell of the mass capacitor. For this reason, when the halogen heater conventionally used is to generate heat for heating, dozens of the mass capacitor cells are connected in series to make a power supply unit capable of supplying about 50 V through 100 V to the halogen heater.\nInstalling the power supply unit of a high voltage in the apparatus, however, poses the following problems. Although an access to the inside of the apparatus is in many cases performed by a maintenance person, a power supply terminal may be inadvertently touched during maintenance work, and an electric shock accident may occur. Further, it is conceivable that a general office worker accesses inside the apparatus for removing a jammed sheet of paper, and the like. For this reason, a preventive measure against an electric shock is required.\nFurther, as the storage capacity of a capacitor cell of the mass capacitor is becoming large, the number of the capacitor cells to be connected in series for obtaining the high voltage and high power is decreasing, and the fewer number of capacitor cells are capable of raising the temperature of the heating target. However, in order to obtain the high voltage using the mass capacitor, it is necessary to increase the number of the capacitor cells, and in other words, an excess capacity of the capacitor cells has to be provided as the configuration of the power supply unit. At present, since"} +{"output_text": " input operation unit that has selected the information concerning the presence of the data. 
The controller then registers the ID number of the input operation unit in the management table.\nIn the aforementioned data transferring and receiving apparatus, the display/input unit may be operated by a single input operation unit or a number of input operation units, each having an ID number. The controller any identify, upon selecting the information concerning the presence of the data indicated on the display/input unit by the input operation unit, the ID", "input_text": " is registered in the checked management table, the controller requests the device on the network indicted by the management information to transmit data corresponding to the management information and receives the data from the device on the network and stores it in the data storage unit.\nIn the aforementioned data transferring and receiving apparatus, there is provided a management table on the network in which the management information representing the information concerning the presence of the data selected by the input operation unit is registered. The management table on the network is shared with other apparatuses connected to the network. When the information concerning the presence of the data is selected by the input operation unit, the management information representing the information concerning the selected data is registered in the management table. When the apparatus of the present invention or another apparatus is selected by the input operation unit, the selected apparatus checks for the management information registered in the management table and requests the apparatus of the present invention to transmit data indicated by the management information. The data transferring and receiving apparatus transmits the data corresponding to the management information to the apparatus that as requested data transmission. 
Further, in this data transferring and receiving apparatus, when the display/input unit is selected by the input operation unit, the management information registered in the management table is checked, and a request is provided to the apparatus of the present invention or another apparatus indicated by the management information registered in the management table to transmit data corresponding to the management information. The apparatus that has received the data transmission request transmits the data corresponding to the management information. Then, the data transferring and receiving apparatus receives the data and stores it.\nIn the aforementioned data transferring and receiving apparatus, the display/input unit may be operated by a single input operation unit or a number of input operation units, each having an ID number. The controller any identify, upon selecting the information concerning the presence of the data indicated on the display/input unit by the input operation unit, the ID number of the"} +{"output_text": " these areas, high frequency ultrasound imaging is advantageous because it allows for the visualization of small structures and the study of hemodynamic function.\nHigh frequency ultrasound imaging is based on the principle that the speed of sound in tissue is different from the speed of sound in blood. The speed of sound in tissue is dependent on the density of the tissue and the temperature of the tissue. The speed of sound in blood is dependent on the density of the blood and the temperature of the blood. The speed of sound in tissue", "input_text": " studied. When a different animal must be used, inaccuracies are inherently introduced into a researcher's findings. These inaccuracies may be due to individual differences between study animals, differing husbandry conditions, or any other number of potential differences. 
All of these drawbacks increase the cost of research by increasing the number of animals needed and by making poor results more likely.\nNon-invasive ultrasound has long been used as a diagnostic tool to aid in therapeutic procedures. It is based on the principle that waves of sound energy can be focused upon an area of interest, reflected and processed to produce an image. To improve the images obtained using conventional, or low frequency ultrasound, echogenic contrast agents are sometimes used to create a reflector of ultrasonic energy in an area of interest. In conventional frequency ultrasound, a rapid development of microbubble contrast imaging techniques for medical ultrasound has occured. Non-linear scattering from resonant bubble population has been exploited to implement a variety of detection methods, which are used to suppress tissue signals and enhance the detection of blood. Non-linear microbubble imaging for commercial ultrasound operating at conventional frequencies has demonstrated important clinical utility in improving structure visualization including improving small vessel detection and in cardiac chamber imaging. Because conventional frequency ultrasound operates in the 1-8 MHz range, microbubble contrast agents have been designed to work well within this frequency range.\nUltrasound has recently been adapted for use in small animal research. In particular, high frequency ultrasound has been used to visualize anatomical structures and hemodynamic function in longitudinal studies of small animals. High frequency ultrasound imaging of small animals is non-invasive and allows longitudinal studies of individual animals. These studies reduce the number of animals required for analysis and alleviate many problems associated with invasive surgery. 
Potential areas of small animal research where high frequency ultrasound imaging is beneficial include, but are not limited to, cancer and angiogenesis studies, developmental biology, cardiovascular research and neurological research. In each of"} +{"output_text": " the flash light emission. The Yamada patent also discloses a process for initializing a phase change recording medium using a laser beam which is modulated by a pulse signal. The pulse signal is generated by a pulse generator which is controlled by a controller. The controller is programmed to generate the pulse signal at a predetermined time interval. The controller is also programmed to generate the pulse signal at a predetermined pulse width. The pulse width is selected to be shorter than the time interval. The controller is also programmed to generate", "input_text": " data. For example, a phase change optical data storage device may have a chalcogenide memory material used as the active memory layer. The chalcogenide memory material may have an amorphous state, a crystalline state and varying intermediate states. When the phase change material is deposited on a disk, the material is formed in an essentially amorphous state. The crystallization characteristics of the material are different in first crystallization from all subsequent crystallizations. Therefore, the disk must be crystallized once before the disk is ready for use. This first crystallization is referred to as initialization. Thus, preparation of a newly manufactured phase change data storage device requires that the device be initialized into a crystalline state so that data can be reliably written and erased.\nPresent day disk initialization is typically carried out by directing a continuous laser beam along a track of an optical disk as the disk rotates. The laser energy is utilized to change the recording medium from an amorphous state to a crystalline state. An example of this process has been described in U.S. Pat. No. 
5,768,221 which issued to Kasami et al. Therein is disclosed a method for initializing an optical recording medium by rotating a disk at a speed of 1000 rpm and directing a laser beam upon a small portion of the recording medium. The laser beam then moves from the inner most to the outermost circumference sections of the disk. Therefore, as the method serially initializes the disk, the method is relatively slow and does not lend itself to efficient mass production.\nOne attempt made at initializing the entire optical disk with a single energy exposure has been described in U.S. Pat. No. 5,684,778, which issued to Yamada et al. on Nov. 4, 1997. The Yamada patent discloses a process for initializing a phase change recording medium using flash light emission which drops the emission strength instantly to a zero level immediately after"} +{"output_text": " into the cache memory in serial order.\nIn the microprocessor, the data DA2 is fetched into the cache memory from the external memory in parallel with the data DA1. Therefore, the data DA2 is not necessarily read out from the cache memory to the arithmetic unit after the data DA1 is read out from the cache memory to the arithmetic unit.\nTherefore, when the arithmetic processing is braked for the debugging operations, the program executed in the arithmetic unit is stopped at a step", "input_text": ", a group of sequential pieces of data DA1 to DA4 is fetched from the external memory into the cache memory in serial order. That is, the sequential pieces of data DA1 to DA4 are stored at addresses AD1 to AD4 of the cache memory. In this case, the data DA2 is not detected by a data detector because the data DA2 is not required by the data signal even though the data DA2 is fetched into the cache memory from the external memory.\nThereafter, the data DA1 stored at the address AD1 of the cache memory is read out from the cache memory to utilize for the arithmetic processing in the arithmetic unit. 
Thereafter, the data DA2 stored at the address AD2 of the cache memory is read out from the cache memory without the occurance of the cache miss.\nTherefore, it is impossible to specify the data DA2 accessed by the arithmetic unit in the microprocessor even though data signals transmitted through the external bus are monitorred without monitorring a piece of data processed in the arithmetic unit.\nAlso, even though the data DA2 is detected by the data detector when the sequential pieces of data DA1 to DA4 are stored at addresses AD1 to AD4 of the cache memory, the data DA2 is not necessarily read out by the arithmetic unit after the data DA1 is read out by the arithmetic unit. Therefore, when the arithmetic processing is braked for the debugging operations, the program executed in the arithmetic unit is stopped at a step not relating to the data DA2.\nAs mentioned above, even though the arithmetic processing executed in the arithmetic unit is scheduled to be braked for the debugging operations when the data DA2 relating to the cache miss is read out from the cache memory to the arithmetic unit, it is impossible to specify the data DA2 because the sequential pieces of data including the data DA2 are fetched"} +{"output_text": " factor, and xcex is the wavelength of the exposure laser beam LR.\nIn the exposure apparatus 1, the exposure laser beam LR is emitted from the laser beam source 10, and the return beam is received by the position detecting device 33. The return beam is reflected by the xc2xc wavelength plate 32, and is separated into two beams by the polarization beam splitter 32. 
The return beam is received by the position detecting device 33, and the distance between the objective lens 19 and the resist", "input_text": " that the optical axis of the laser beam LF is spaced apart from the optical axis of the exposure laser beam LR, the optical axis of which substantially coincides with the optical axis of the optical system including the objective lens 19, etc.\nDue to this arrangement, in the auto focus optical system, regarding the return beam, which is obtained through specular reflection at the resist layer of the disc master 2 of the laser beam LF obliquely impinging upon the disc master 2, the position of the optical axis varies according to the distance between the objective lens 19 and the resist layer. Regarding the return beam thus obtained, the auto focus optical system imparts a phase difference thereto when it reversely travels through the optical path of the laser beam LF and is transmitted through the xc2xc wavelength plate 32, whereby the laser beam LF is subsequently separated by the polarization beam splitter 32. The beam is further received by a position detecting device 33, and the distance between the objective lens 19 and the resist layer is detected from the light receiving position.\nIn the auto focus optical system, the optical axis position, etc. 
of the laser beam LF is adjusted such that the variation in the condensing position of the return beam at the position detecting device is approximately 100 times the variation in the distance between the objective lens 19 and the resist layer, and the objective lens 19 is displaced in the optical axis direction to thereby effect focus control.\nThe exposure apparatus 1 is mounted on an air base plate so that the optical systems and the mechanical system may not be affected by external vibrations of the place of installment, whereby the exposure accuracy can be improved.\nIn the exposure apparatus 1, which performs exposure on the disc master 2 in this way, assuming that the resolution is P, P=Kxc2x7(xcex/NA), where NA is the numerical aperture of the objective lens 19, K is the process"} +{"output_text": " window region and an active region, a first compound semiconductor layer formed on the substrate, a second compound semiconductor layer formed on the first compound semiconductor layer, a window region formed in the second compound semiconductor layer, a window region formed in the first compound semiconductor layer, and a mesa stripe formed in the second compound semiconductor layer and the first compound semiconductor layer, wherein the window region is formed in the second compound semiconductor layer and the first compound semiconductor layer, and the mesa stripe is formed in the second", "input_text": " thickness of the compound semiconductor layers 38/39/40 or more serious than it. 
Upon completion of the wet etching, the dispersion of the step between the window region and the active region is further dependent on the dispersion of the depth over the semiconductor wafer, and is determined from both of the dispersion in the step and the dispersion of the depth.\nEven though the etching stopper enhances the controllability of the wet etching and makes the dispersion of the depth insignificant, the dispersion of the thickness is still left on the semiconductor wafer, and the dispersion of the coupling efficiency takes places at the boundary between the window region and the active region in the horizontal transverse mode due to the dispersion of the thickness. This is the second problem.\nEven if the window region is equalized in height with the active region, roughness is still left around the boundary between the window region and the active region. The rough surface results in roughness of the mesa bottom around the boundary upon completion of the mesa stripe through the etching. Moreover, the width of the SiO2 stripe 51 is not constant due to the irregularity of the thickness of the photo-resist mask formed over the rough surface. This results in that the mesa bottom is not constant in width. The roughness and the non-uniform width makes large scattering loss against the oscillation light in the horizontal transverse mode. 
This is the third problem.\nIt is therefore an important object of the present invention to provide a window type semiconductor laser light emitting device, which has a low coupling loss at the boundary between a window region and an active region in the transverse mode and a low scattering loss at the boundary.\nIt is also an important object of the present invention to provide a process for fabricating the window type semiconductor laser light emitting device.\nIn accordance with one aspect of the present invention, there is provided a window type semiconductor laser light emitting device comprising a substrate having a"} +{"output_text": " by the addition of certain transition metal complexes. The most important of these complexes are the transition metal complexes of the type [M(II)(L)2]n+ where M is a transition metal, L is a ligand and n is the number of ligands. The transition metal complexes are usually added in the form of their salts. The most important transition metal complexes are the iron complexes of the type [Fe(II)(L)2]n+ where L is a ligand and n is the", "input_text": " lignocellulosic raw material has been removed during the pretreatment step of the present invention and fresh catalytically active metals and metal complexes may, as specified herein, be added within or before the oxygen dilignificaton stage. The quantity of metals compounds added must be controlled since a too high concentration not only hinders the initiation of the desired reactions, but also lowers the selectivity because the rate of radical chain oxidation is usually limited by oxygen transport through the liquor to the reactive sites. 
Too high catalytic activity leads to oxygen deficiency or starvation and the excess radicals are reacting along undesired paths.\nThe active transition metal catalysts used to enhance oxygen delignification selectivity in accordance with the invention are present in concentrations ranging from 10 ppm to 5000 ppm calculated on dry lignocellulosic material and more preferably in the range of 10 to 300 ppm.\nIt is thus a major objective of the present invention to control the metal profiles in the oxygen delignification stage by addition of catalytic substances comprising metals or metal complexes combined with addition of carbohydrate protector substances to effect rapid delignification while preventing carbohydrate depolymerisation.\nIt is normally desired to produce as strong pulp as possible and the preservation of carbohydrates during delignification is specifically important. A low degree of carbohydrate degradation is reflected by a high molecular weight distribution in the pulp and preserved physical strength properties in the pulp product.\nIn order to protect the carbohydrates from excessive degradation it is desirable to carry out the oxygen delignification stage in the presence of radical scavengers and carbohydrate degradation inhibitors or carbohydrate protectors or mixtures of these substances.\nThe inhibitors or carbohydrate protectors can act through several different pathways such as hindrance of the formation of the active radicals and intermediates, by lowering their concentrations through complexing or simply by decomposing the undesired species.\nIt was discovered in the sixties and seventies that carbohydrate degradation during oxygen delignification was retarded"} +{"output_text": ", PITP has been implicated in the formation of secretory vesicles in yeast (Ohashi, M. et al. (1995) supra).\nThe yeast PITP homologue, PIT1, has been shown to be required for the formation of secretory vesicles in yeast (Ohashi, M. et al. (1995) supra). 
PIT1 is a cytosolic protein that is localized to the trans-Golgi network (TGN) and is required for the formation of secretory", "input_text": " it down, but also to bring about the necessary motion and force transfer processes by means of hydraulically or pneumatically acting control elements without using long and difficultly manufacturable threaded spindles or the like, so that even with small constructional sizes high power levels can be transferred (DAS No. 25 40 864). Membrane biogenesis is essential for cell growth and differentiation. During membrane biogenesis in eukaryotic cells, newly synthesized phospholipids must be transported from their sites of synthesis to their sites of function. Vesicular traffic in eukaryotic cells is characterized by two steps of membrane rearrangement: the formation of vesicles from donor membranes and fusion of these vesicles with acceptor membranes. With respect to vesicle formation, several of the cytosolic proteins implicated in budding and fission have been identified. These stimulate the formation of constitutive secretory vesicles and immature secretory granules from the trans-Golgi network.\nPhosphatidylinositol transfer protein (PITP) is a member of a diverse set of cytosolic lipid transfer proteins that are distinguished by their ability to transfer phospholipids between membranes in vitro and to take part in secretory vesicle formation (Wirtz, K. W. A. (1991) Ann. Rev. Biochem. 60:73-99; Ohashi, M. et al. (1995) Nature 377:544-547). PITP has been purified from mammals, plants, fungi and bacteria (Wirtz, K. W. A. (1991) supra).\nPITPs have raised considerable interest because of their proposed roles in the phosphoinositide cycle and the ATP-dependent, Ca.sup.2+ -activated secretory process (Thomas, G. M. et al. (1993) Cell 74:919-928; Hay, J. C. and Martin, T. F. J. (1993) Nature 366:572-575). 
Furthermore"} +{"output_text": ", aryl, arylalkyl, arylalkoxy, arylalkylamino, arylalkyloxy, arylalkylthio, arylalkylsulfonyl, arylalkylsulfonylamino, arylalkylsulfonyloxy, arylalkylsulfinyl, arylalkylsulfonylamino, arylalkylsulfinyloxy, arylalkylsulfonylaminoalkyl, arylalkylsulfinyloxyalkyl, arylalkylsulfonylaminoalkyl, arylalkylsulfinyloxyalkyl, arylalkylsulfonylaminoalkylamino, arylalkylsulfinyloxyalkylamino", "input_text": " The use of two or more anti-viral agents to provide improved therapy for the treatment of hepatitis B virus infections is desirable due to the morbidity and mortality of the disease. Combination therapy is also desirable since it should reduce toxicity in patients as it enables the physician to administer lower doses of one or more of the drugs being given to a patient. Combination therapy can also help to prevent the development of drug resistance in patients (Wiltink, E. H. H., Pharmaceutish Weekblads Scientific Edition (1992) 14(4A):268-274). The result of an improved efficacy configuration combined with a relative lack of toxicity and development of resistance would provide a much improved drug treatment profile.\nThe present inventor has surprisingly discovered that the combined use of N-substituted-l,5-dideoxy-1,5-imino-D-glucitol compounds and nucleoside or nucleotide antiviral compounds, or combinations thereof, and/or immunomodulators/immunostimulants, results in unexpectedly greater anti-hepatitis virus effectiveness of the compounds compared to the combined antiviral activities expected of the individual compounds alone. 
Whether this is due to different mechanisms of action of the different classes of drugs employed or some other biological phenomenon is presently unclear.\nAccordingly, in a first aspect, the present invention provides a method of treating a hepatitis virus infection in a mammal, comprising administering to said mammal a first amount of an N-substituted1,5-dideoxy-1,5-imino-D-glucitol compound of Formula I: \nwherein:\nR is selected from the group consisting of arylalkyl, cycloalkylalkyl, and branched or straight chain alkyl having a chain length of C1 to C20, and\nW, X, Y, and Z are each independently selected from the group consisting of hydrogen, alkanoyl, aroyl"} +{"output_text": " at a high speed, a fixing system that uses a heater is known.\nJapanese Provisional Patent No. H10-282821 discloses a fixing apparatus that uses a heater in addition to the main power supply.\nJapanese Provisional Patent No. 2000-315567 discloses a fixing apparatus that uses a heater in addition to the main power supply.\nJapanese Provisional Patent No. 2000-075737 discloses a fixing apparatus that uses a heater in addition to the main power", "input_text": " safety is realized.\nJapanese Provisional Patent No. H5-232839 discloses a heating apparatus wherein an auxiliary power supply provides power to a second heating element, rather than increasing the power to a first heater for heating the fixing roller.\nJapanese Provisional Patent No. H10-10913 discloses an energy-saving type fixing apparatus that employs an auxiliary power supply. With this fixing apparatus, the rechargeable battery serving as the auxiliary power supply is provided in order to obtain two levels of power from a single power supply. It does not aim at supplying power greater than the power available from only the main power supply.\nJapanese Provisional Patent No. 
H10-282821 discloses an image forming apparatus that uses an auxiliary power supply, such as a rechargeable battery and a primary battery, in addition to the main power supply for providing various functions.\nJapanese Provisional Patent No. 2000-315567 discloses a heating apparatus using a mass capacitor in addition to the main power supply as an auxiliary power supply. According to this heating apparatus, the auxiliary power supply assists the commercial power supply at the time of starting; thereby heating time is shortened, saving energy.\nJapanese Provisional Patent No. 2000-075737 discloses an image forming apparatus equipped with a power supply based on the commercial power supply and a storage battery, including storage battery checking means for determining presence of the storage battery, and charge capacity surveillance means for supervising charging capacity of the storage battery, wherein productivity is reduced during the charging of the storage battery based on determinations of the storage battery checking means and the charge capacity surveillance means.\nFurther, according to Japanese Provisional Patent No. 2000-075737, charging a storage battery is carried out externally and during night hours, for charging the storage battery takes a long time.\nAs a fixing system that realizes the temperature rise of the image forming apparatus"} +{"output_text": " the critical point. The working fluid is in a supercritical state when the temperature is above the critical temperature. The working fluid is in a subcritical state when the temperature is below the critical temperature. The working fluid is in a supercritical state when the pressure is above the critical pressure. The working fluid is in a subcritical state when the pressure is below the critical pressure.\nThe working fluid is in a subcritical state when the temperature is below the critical temperature. 
The working fluid is in", "input_text": " side, a heat pump will \u201cmultiply\u201d the heat as compared to resistive heat generation. The ratio of heat output to work input is called coefficient of performance, and it is a value larger than one. In this way, the use of a heat pump will increase the round-trip efficiency of a TEES system.\nIn EP 08162614, the applicant has described the concept of utilizing transcritical thermodynamic cycles to improve TEES systems. FIG. 1 illustrates temperature profiles in a heat exchanger in contact with a thermal storage medium during charging and discharging cycles of a transcritical TEES system. The abscissa represents the provided heat in the system, the ordinate represents the temperature, and the lines on the graph are isobars. The solid line indicates the temperature profile of the working fluid in a transcritical TEES charging cycle. The dotted line indicates the temperature profile of the working fluid in a transcritical TEES discharging cycle. The straight diagonal dashed line indicates the temperature profile of the thermal storage medium in a transcritical TEES cycle. Heat can only flow from a higher to a lower temperature. Consequently, the characteristic profile for the working fluid during cooling in the charging cycle has to be above the characteristic profile for the thermal storage media, which in turn has to be above the characteristic profile for the working fluid during heating in the discharging cycle. The temperature profiles are stationary in time due to the sensible heat storage in the thermal storage medium. Thus, while the volume of thermal storage medium in the heat exchanger remains constant, the volume of the hot and cold thermal storage medium stored in the hot-fluid and cold-fluid storage tanks changes. 
Also, the temperature distribution in the heat exchanger remains constant.\nA transcritical cycle is defined as a thermodynamic cycle where the working fluid goes through both subcritical and supercritical states. There is no distinction between a gas phase and a vapor phase beyond"} +{"output_text": " first refractory metal layer, and a layer comprising tungsten over the aluminum layer. The tungsten layer is patterned to form an opening therein, with the opening exposing an underlying portion of the aluminum layer. The patterned tungsten layer is then exposed to an etch having a substantially higher etch rate of the aluminum layer than of the tungsten layer to remove the exposed portion of the aluminum layer.\nIn accordance with another aspect of the invention, a method is provided which includes providing a layer which contains aluminum.", "input_text": ".\nA limitation of this patterning process employing photoresist arises for devices with sub-0.25 micron features which have metal stacks which are even higher than those used in previous generations. The photoresist will eventually not be able to withstand the etch processes required for higher metal stacks with shrinking device geometries.\nAnother limitation of photoresist-based etching processes is sidewall polymer. removal and corrosion control. During the etch, chlorine-containing polymer-like layers are formed on the sidewalls of etched features when chlorine reacts with the photoresist. Then, upon exposure to moisture, the chlorine on the wafer may give rise to serious corrosion problems.\nAn alternative approach to photoresist masks has been the implementation of a hard mask. Different dielectric materials have been suggested for a hard mask, such as SiO2 or Si3N4. However, these materials are not conductive. Thus, after the metal lines are defined, the dielectric hard masks must be removed to enable contact to subsequent metallization layers. 
Also, anti-reflective coatings must be used for patterning the photoresist, to counteract the reflectivity of the aluminum which interferes with the definition of fine lines and spaces.\nIn accordance with one embodiment of the invention, a method is provided which includes providing a layer which contains aluminum. A tungsten (W)-containing layer is provided over the aluminum-containing layer. The tungsten-containing layer is patterned to form an opening therein, with the opening exposing an underlying portion of the aluminum-containing layer. The patterned tungsten-containing layer is then exposed to an etch having a substantially higher etch rate of the aluminum-containing layer than of the tungsten-containing layer to remove the exposed portion of the aluminum-containing layer. In some embodiments, the etch is a dry etch.\nIn accordance with another aspect of the invention, a metal stack is provided which includes a first refractory metal layer, a layer comprising aluminum over the"} +{"output_text": "-connectors were used to make the physical connection.\nThe original Ethernet was a shared medium, meaning that all stations on the network could transmit and receive at the same time. This was a big advantage for the early days of the Internet, when the network was used mostly for file transfers.\nThe Ethernet protocol was designed to be simple and easy to implement. It was also designed to be robust, so that it could be used in a variety of environments. The original Ethernet was designed to be", "input_text": " This works, but is inefficient because the line is held open and null data is being sent between the bursts of data that matter. 
If one wants to send audio over packet networks, the continuous audio data must be converted into packets and then the packets are reconverted into audio signals back together at the receiving end.\nEfforts to improve this cumbersome process make sense because: computer networks are much cheaper these days than circuit-oriented networks owing to their ubiquity and high-volume, it is often desirable to have both audio and data simultaneously on the same network, and computers are now very often either the source or destination for audio signals. \nOne example that illustrates a convergence of the two networks styles most clearly in the VOIP (Voice Over Internet Protocol) telephone application that is rapidly gaining popularity. The idea is that only one cable is needed to connect both a PC and a telephone. The switch that makes this happen is a cheap commodity Ethernet switch rather than an expensive proprietary PBX. The cost benefit is significant.\nThe same reasoning applies to the high-fidelity audio networks used in radio stations and other studio facilities, with their expensive PBX-like router switches at the core. Thus, the motive to use Ethernet for audio transmission.\nOriginal Ethernet\nOriginally, Ethernet networks were packet networks, but by convention, Ethernet packets are also called frames, (not to be confused with the term audio frames used later in this application). These range from 72 to 1526 bytes, depending on the amount of data to be carried. The original Ethernet was based on a single shared coaxial cable\u2014the Ether in Ethernet's name. The very first versions used a \u00bd\u2033 thick cable with physical taps into it\u2014one actually had to cut a little piece out of the jacket and screw in a metal part that made contact with the ground and center conductors. 
Later, the coax cable was smaller and T"} +{"output_text": " first output signals \u2018Y0\u2019 and \u2018Y1\u2019 at low level, regardless of the input signal \u2018in\u2019.\nThe MDS 20 is a logic for detecting a metastable state of the resolver 10. The MDS 20 receives the first output signals \u2018Y0\u2019 and \u2018Y1\u2019 and the internal clock signal \u2018CLK\u2019, and generates second output signals \u2018Y2\u2019 and \u2018Y3\u2019. Specifically, if the internal clock signal \u2018CLK\u2019 is at a low level,", "input_text": ". Related Art\nRecently, restrictions inherent in conventional asynchronous circuits and new challenges for timing detection present by conventional semiconductor devices, a system that externally operates in an asynchronous manner but internally operates based on a clock signal, that is in a synchronous manner, has been introduced. Such a system is called an EAIC (Externally Asynchronous-Internally Clocked) system. In an EAIC system, when viewed from outside the device, only an external output signal, which responds to an external input signal, seems to be present, but actually an internal clock signal is generated on the basis of the input signal.\nA DFLOP is used for a shift register in the EAIC system. FIGS. 1 and 2 are a conceptual view and a block diagram of a conventional DFLOP. The DFLOP of FIGS. 1 and 2 operates similarly to a D flip-flop, except that if a signal transmission operation is completed, then a ready signal R is generated to inform that the DFLOP is ready to execute a new operation.\nReferring to FIGS. 1 and 2, the DFLOP includes a resolver 10, a metastable detection stage (MDS) 20, a latch 30, and a ready signal generator 40.\nFirst, the resolver 10 is a flip-flop type logic for defining operation conditions according to an input signal. 
The resolver 10 receives an input signal \u2018in\u2019 and an internally generated internal clock signal \u2018CLK\u2019, and generates first output signals \u2018Y0\u2019 and \u2018Y1\u2019. Specifically, if the internal clock signal \u2018CLK\u2019 is at a low level, then the resolver 10 generates the first output signals \u2018Y0\u2019 and \u2018Y1\u2019 at high level, regardless of the input signal \u2018in\u2019. When the internal clock signal \u2018CLK\u2019 is changed to a high level, then the resolver 10 generates the"} +{"output_text": "-63-140104, JP-A-63-140105, JP-A-63-140106, JP-A-63-140107, JP-A-63-140108, JP-A-63-140109, JP-A-63-140110, JP-A-63-140111, JP-A-63-140112, JP-A-63-140113, JP-A", "input_text": " coil grooves.\nFIGS. 21D, 21E and 21F are schemes representing another related art example of the stator core corresponding to the rotor core shown in FIGS. 21A to 21C, FIG. 21D being a plan view; FIG. 21E being a cross sectional view; FIG. 21F being a back view; in which reference numerals 81 and 83 to 830 to 850 indicate coil grooves.\nFIGS. 22A, 22B and 22C are schemes representing still another example of the rotor core, FIG. 22A being a plan view; FIG. 22B being a cross sectional view; FIG. 22C being a back view; in which 401a, and 403a, 403b to 405a, 405b are small holes, which are formed by dividing one throughhole having a diameter of about 1 mm into two holes with a partitioning member 10 made of non-magnetic resin, etc., in order to draw out an uninsulated coil through the same throughhole.\nFIGS. 22D, 22E and 22F are schemes representing an example of the stator core corresponding to the rotor core shown in FIGS. 22A to 22C, FIG. 22D being a plan view; FIG. 22E being a cross sectional view; FIG. 
22F being a back view; in which 801a, 801b and 803a, and 803b to 805a, 805b are small holes, which are formed by dividing one throughhole into two holes with a partitioning member 10 made of non-magnetic resin, etc.\nIn the above related art rotary magnetic head devices and rotary transformers used therefor have been described. As literature describing such related art techniques, JP-A-59-78508, JP-A-61-201405, JP-A-62-179107, JP-U-A"} +{"output_text": " simple design.\nThe fluidification stage in accordance with the invention is also characterized by the fact that the starchy material is fluidified in the fluidification reactor by means of a fluidizing gas which is at least partially constituted by steam.\nThe steam is preferably introduced into the fluidization reactor in the form of a steam-air mixture.\nThe steam-air mixture is preferably introduced into the fluidization reactor in the form of a steam-air mixture which is at least partially constituted by steam", "input_text": " temperature at the core of the fluidification reactor is somewhere between 65\u00b0 and 85\u00b0 C. 
approximately.\nA first concern of industry in the starchy materials conversion process in accordance with the invention is to expect, during the actual chemical fluidification stage, relatively short reaction times, illustrated by dwell times of the starchy material in the fluidification reactor of less than 30 minutes approximately.\nThus, in the context of continuously conducted operations, the volume of starchy material being fluidified in the reactor at any given time is limited compared with a technology anticipating significantly greater dwell times as described for example in the aforementioned WO patent application 97/137988.\nAs a result of which, in the event of any drift of the fluidification operation during the implementation of the process of the invention, in particular in case, for a given reason, the starchy materials have been fluidified beyond the required amount, the volumes of products which are unsatisfactory and needing to be downgraded would be heavily reduced compared with those which, in the same case, would ensue from a fluidification technology with a (relatively) high dwell time.\nThis constitutes an undeniable economic and industrial advantage.\nOn the other hand, dwell times expected in compliance with the invention are sufficiently long ( greater than 5 minutes) not to involve necessarily, even if it can be expected, the additional use of very specific devices necessary for a fluidification reaction which is at once ultra fast or almost instantaneous, homogeneous and controlled. 
Such devices are, for example, described in the aforementioned patents EP 041,316 (UHF radiation device) and EP 710,670 (turbo-reactor).\nAs already indicated, the fluidification stage complying with the present invention has the additional interest of being able to be conducted on non-specific devices which are straightforward to use, and particularly on reactors of very"} +{"output_text": " the figure, provided on both the first surface and the second surface.\nThe thermal developing apparatus 100 includes a thermal developing part 101, a thermal developing transferring part 102, a thermal developing fixing part 103, and a thermal developing cooling part 104.\nThe thermal developing part 101 heats the recording material A to visualize the latent image recorded on the image forming layer A2. The thermal developing part 101 includes a heating roller 101a and a pressure roller 101b. The heating roller 101a is heated", "input_text": " the image forming layer on the side not heated is delayed. Due to the delay in development, deviation occurs in color tone, for example, the color of the image forming layer is discolored in brown. 
Furthermore, in the case where heat is not sufficiently transmitted to the image forming layer on the side not heated, development thereof becomes insufficient to cause density fluctuation, in which the density thereof is reduced.\nOn the other hand, in a thermal developing transferring part in a thermal developing image forming apparatus, in which a recording material is also heated from the side having no image forming layer, i.e., a image forming layer formed only on one surface is auxiliary heated, the heating operation does not intend to heat image forming layers provided on both sides, and therefore, difference in development occurs between the image forming layers on front and back surfaces to cause deviation in color tone and fluctuation in density.\nThe inventors have developed such a thermal developing apparatus that solve the problems associated with the aforementioned conventional technique.\nFIG. 1 is a constitutional diagram showing a first embodiment of a thermal developing apparatus of the invention, and FIG. 2 is a cross sectional view of a photosensitive thermal developing recording material used therein.\nIn FIG. 1, a thermal developing apparatus 100 heats a photosensitive thermal developing recording material (recording material) A to visualize a latent image recorded on an image forming layer. As shown in FIG. 
2, the recording material A used in the thermal developing apparatus 100 has image forming layers A2 and A2 each containing a photosensitive material provided on both the first surface as one surface (for example, a front surface) of a support A1 and the second surface as the other surface (for example, a back surface) thereof.\nIn the thermal developing apparatus 100, such a recording material A can be used that is a double sided photosensitive film having fluorescent intensifying screens, which are not shown in"} +{"output_text": ", the number of searches per copy grows as the square of the number of members, which is equivalent to the square of the number of searches divided by the number of copies. As the number of members grows linearly with N, the number of searches per copy grows linearly with N. Therefore, the number of searches per copy grows as the square of the number of members divided by the number of copies. As both numbers grow linearly with N, the number of searches per copy remains constant.\nIn a", "input_text": " difficulties coping with the load, and can only work if a large fraction of the queries are solved by means of caches. Old copies of the name to address resolutions linger in these caches, however, which makes fast updates difficult. Further, the centralized server is a point of political, legal and commercial control. These controls can interfere with the reliability of the service. One may be tempted to dismiss these weaknesses as mere scaling issues, but it is very clear that they derive directly from the use of centralized services.\nIn Gnutella, the database is fractioned into a large number of components. A global search is performed by executing parallel searches on a copy of each component and merging the results. This form of spreading trades memory, the footprint of the database on each node, for messages and computation. 
If the database is partitioned in P components, for example, then each request will require at least P messages and will trigger searches in at least P nodes. If the dataset is limited in size, then the number of components P is entirely a function of the relation between the size of the dataset and the maximum size S that a given node can store. In that case, the system scales if the number P of components is basically a constant. However, as the number N of nodes increases, the number of copies of a given component grows as O(N/P), which is equivalent to O(N). As such, the number of searches grows as the number of nodes, O(N). Therefore, the number of searches that a given copy of a component must process scales as the number of searches divided by the number of copies. As both numbers grow linearly with N, the number of searches per copy remains constant.\nUnfortunately, in a name server application both the size of the database and the number of searches grow linearly with N, the number of members. This presents a scaling problem. Specifically"} +{"output_text": " own fast axis collimator being provided.\nThe fast axis collimators of the diode laser bars are made as micro optics, i.e. as small as possible, and are arranged in the beam path in the immediate vicinity of the emitters or of the facet of the diode laser bar. The fast axis collimators are made as micro optics, i.e. as small as possible, and are arranged in the beam path in the immediate vicinity of the emitters or of the", "input_text": " row in the plane parallel to the active layer, i.e. in the slow axis. The resulting overall beam of this bar in the plane parallel to the active layer has an opening angle of roughly 10\u00b0 and a beam diameter of roughly 10 mm. 
This yields a beam quality in this plane which is many times less than the beam quality in the plane perpendicular to the active layer.\nThe occupation density which results from the quotient of the radiating area of the laser bar to the total area in currently available diode laser bars is roughly 30-50%, in any case higher occupation densities allowing only pulsed operation of the laser. For continuous applications, smaller occupation densities are necessary.\nIn order the make the highly divergent radiation of a diode laser useful for laser applications, for example, material machining, medical technology, pumping of solid state lasers, etc., collimating and focusing optical arrangements are necessary in the beam path. These optical arrangements contain one fast axis collimator, which is made as micro optics, and which has the optical property of a cylinder lens which lies with its axis parallel to the slow axis, for all emitters of a diode laser bar its own continuous cylinder lens being used with a small focal distance in the immediate vicinity of the facet of the diode laser bar, i.e. at a distance of a few hundred mu from the emitters or from this facet. The divergence in the slow axis is then corrected by following macro-optics.\nTo attain higher powers, as are necessary, for example, in materials machining, in medical engineering, for pumping of solid state lasers, etc., providing several rows of emitters or several diode laser bars in a stack in several planes on top of one another is known, these planes being offset against one another in the direction of the fast axis and to each row of emitters, or each diode laser bar of each plane, its"} +{"output_text": " solution. The dissolved sulfur is then recovered as a sulfide salt.\nThe present invention is directed to a method for the production of a sulfur-containing product from a sulfur-containing feedstock. 
The method comprises the steps of: (a) contacting the feedstock with a catalyst to produce a reaction product; and (b) separating the reaction product from the catalyst. The catalyst comprises a metal oxide having a surface area of at least about 100 m2/g and a pore volume of at", "input_text": " low melting and agglomeration point and although various fluidized bed concepts have been disclosed for conversion of cellulose spent liquors, it is generally agreed that a suspension or entrained flow gasifier is more suitable for conversion of the highly alkaline liquor. Fixed bed gasifiers are not practical for conversion of liquid fuels.\nGasification or partial oxidation of black liquor in suspension bed gasifiers is presently being introduced on the market for recovery of chemicals and energy from kraft spent liquor. Gas generators of this type can advantageously be used for the recovery of chemicals from the spent cellulose liquors generated during the manufacturing of the chemical pulp in accordance with the present invention. The spent liquors can either be combusted completely in the gas generator or more preferably they can be partially oxidized in order to obtain a combustible gas. More specifically, a chemicals recovery system of the foregoing character would have the desired capability of recovering the chemicals and chemical reagents used in the oxygen delignification process of the present invention. Furthermore, recovery through partial oxidation of cellulose spent liquors provides better thermal efficiency and is substantially more cost effective relative to the traditional recovery boiler system.\nSeveral types of gasifiers can be used, with minor modifications, in the practice of the present invention including, for example, the gasifiers described in U.S. Pat. No. 4,917,763, U.S. Pat. No. 4,808,264 and U.S. Pat. No. 4,692,209. 
These gasification systems are, however, optimized for chemicals and energy recovery from high sulfidity cellulose spent liquors. The sulfur chemicals are recovered as alkali sulfides but a substantial portion of the sulfur will also follow the raw fuel gas as hydrogen sulfide and carbonyl sulfide. Entrained molten alkaline chemicals in the raw fuel gas are separated from the gas stream in a cooling and quenching stage and dissolved in an aqueous"} +{"output_text": " image sharpness.\nThe use of polymer latex particles in a hydrophilic light sensitive layer is described in U.S. Pat. No. 3,672,898. The use of polymer latex particles in a hydrophilic light insensitive layer is described in U.S. Pat. No. 3,672,899. The use of polymer latex particles in a hydrophilic light sensitive layer is described in U.S. Pat. No. 3,672,899. The use of polymer latex particles", "input_text": " the potential use of T3S as a medicament, taken as such or preferably in combination with other thyroid hormones or pro-hormones, like, for example T4. The fact that, only in some specific tissues of the body and under particular, peculiar circumstances, part of T3S can be reconverted into T3 does not mean, nor implies, nor suggests that it is possible to generalize this feature to the whole organism through exogenous administration of the product. In particular, there is no suggestion that oral administration of the product, even in protected form according to known methods of the pharmaceutical technique, may render it bioavailable also because it is well known that in those districts where suitable sulfatases are not present the same is rapidly metabolized and excreted through the bile and urines. It is known to use synthetic polymer particles in silver halide photographic elements to improve physical characteristics. 
In particular, polymer particles from 0.5.mu.m (500 nm) to 10.mu.m have found wide use as matting agents in an element to increase the surface roughness so as to reduce self-adhering of the material, to reduce sticking of the material to manufacturing and processing devices, to improve the antistatic properties of the material, and to improve the vacuum adhesiveness of the material in contact exposure to prevent Newton's rings. Polymer particles smaller than 500 nm obtained by emulsion polymerization technique (polymer latex particles) have found wide use as replacements for gelatin. For example, it has been proposed to use polymer latex particles in both hydrophilic light sensitive layers and hydrophilic light insensitive layers to improve the element dimensional stability, to improve element drying characteristics during photographic processing, to improve layer adhesion and flexibility, to reduce pressure fog, to control dye and image stability, to carry photographically useful compounds such as dyes, couplers, accelerators, hardeners, etc., and to improve the"} +{"output_text": " the reinforcing textile sheet and the other wing component member. Preferably, the other wing component member is a composite material wing component member formed by impregnating a reinforcing textile sheet with a thermosetting resin. Preferably, the composite material wing is a box structure having a closed space. Preferably, the other wing component member is a composite material wing component member formed by impregnating a reinforcing textile sheet with a thermosetting resin. 
Preferably, the composite material wing is a box structure having a", "input_text": " parts with the fasteners needs assembling jigs and takes much time.\nAccordingly, it is an object of the present invention to provide a method of fabricating an aircraft main wing of a composite material in a box structure having closed spaces capable of forming the main wing by a low-temperature half-setting process and a heat-setting process, of shortening a molding process and reducing assembling work.\nAccording to a first aspect of the present invention, a method of fabricating a composite material wing includes the steps of: superposing a reinforcing textile sheet on a mandrel; closely enclosing the reinforcing textile sheet superposed on the mandrel in a closed jig; introducing a thermosetting resin into the closed jig to impregnate the reinforcing textile sheet with the thermosetting resin; making the thermosetting resin impregnated into the superposed reinforcing textile sheet half-set to form a half-set composite material wing component member; taking the half-set composite material wing component member and the mandrel out of the jig; removing the mandrel from the half-set composite material wing component member; bonding the half-set composite material wing component member and an other wing component member with an adhesive to form an assembly; and heat-setting the assembly to complete a composite material wing.\nPreferably, the other wing component member is a composite material wing component member formed by impregnating a reinforcing textile sheet with a thermosetting resin. Preferably, the composite material wing is a box structure having a closed space. Preferably, each of the wing component members includes at least one of a front spar, a rear spar, a plurality of ribs extended between the front spar and the rear spar, an upper skin overlying the ribs, and a lower skin underlying the ribs. 
Preferably, at least one of the wing component members is an integral member formed by integrally combining"} +{"output_text": ". The gratings were fabricated on a glass substrate by means of a replica technique. The gratings were measured by means of a scanning electron microscope. The results of the measurements were compared with the results of theoretical calculations. The comparison showed a good agreement.\n(4) \u201cReconstruction of the Profile of Gold Wire Gratings. A comparison of Different Methods\u201d, H. Lochbihler et. al., Optik, 98 (1), pp. 21-25 (1994) deals", "input_text": " numerous publications. Publications, in which diffraction efficiency was measured versus wavelength, include, for example the following:\n(1) A. Roger and D. Maystre, J. Opt. Soc. Am, 70 (12), pp. 1483-1495 (1979) and A. Roger and D. Maystre, Optica Acta, 26 (4), pp. 447-460 (1979) describe and systematically analyze the problem of reconstruction of the line profile of a grating from its diffraction properties (the inverse scattering problem). A later article \u201cGrating Profile Reconstruction by an Inverse Scattering Method\u201d, A. Roger and M. Breidne, Optics Comm., 35 (3), pp. 299-302 (1980) discloses how the idea disclosed in the above articles can be experimentally used. The experimental results show that the line profile can be fitted such that the calculated diffraction efficiency will closely match the diffraction efficiency measured as a function of wavelength for \u201c\u22121\u201d diffraction order. The comparison of these experimental results with electron microscopy measurement showed a reasonable agreement.\n(2) \u201cReconstruction of the Profile of Gold Wire Gratings. A comparison of Different Methods\u201d, H. Lochbihler et. al., Optik, 98 (1), pp. 21-25 (1994) deals with the comparison of the results of several experimental techniques. 
Both optical transmittance and reflectance efficiencies were measured in the \u201c0\u201d order as a function of wavelength. By fitting the measurements to theoretical spectra calculated using diffraction theory, the grating profile was found. Comparison of these results with the results of X-ray diffraction efficiency and electron microscopy showed a good agreement.\n(3) Voskovtsova, L. M. et al., Soviet Journal of Optical Technology 60 (9) pp. 617-19 (1993) studies the properties of gratings fabricated by replica technique"} +{"output_text": " computer 20 may for example send the results back to the first computer 10 by sending a message to the first logic means 50, which then passes the message on to the first application program 40.\nThe present invention is not limited to the use of a single server computer 20, but may be used with a plurality of server computers 20, each of which is capable of performing a specific task.\nThe present invention is not limited to the use of a single client computer 10, but may be used with", "input_text": " a trademark of the IBM corp.). For the purposes of the present invention it is irrelevant whether the requests for communications services to be carried out by the server are instigated by user interaction with the first application program 40, or whether the application program 40 operates independently of user interaction and makes the requests automatically during the running of the program.\nWhen the client computer 10 wishes to make a request for the server computer 20\"\"s services, the first application program 40 informs the first logic means 50 of the service required. It may for example do this by sending the first logic means the name of a remote procedure along with a list of input and output parameters. 
The first logic means 50 then handles the task of establishing the necessary communications with the second computer 20 with reference to definitions of the available communications services stored in the storage device 60. All the possible services are defined as a cohesive framework of object classes 70, these classes being derived from a single object class. Defining the services in this way gives rise to a great number of advantages in terms of performance and reusability.\nTo establish the necessary communication with the server 20, the first logic means 50 determines which object class in the framework needs to be used, and then creates an instance of that object, a message being sent to that object so as to cause that object to invoke one of its methods. This gives rise to the establishment of the connection with the server computer 20 via the connection means 80, and the subsequent sending of a request to the second logic means 90.\nThe second logic means 90 then passes the request on to the second application program 100 (hereafter called the service application) running on the server computer 20 so that the service application 100 can perform the specific task required by that request, such as running a data retrieval procedure. Once this task has been completed the service application may need to send results back to the first computer 10. 
The server"} +{"output_text": " present invention further provides a selective reduction type, high temperature superconductor, characterized in that the selective reduction type, high temperature superconductor is made of a high temperature superconducting material that can be described by composition formula:\nCu1\u2212xTlx(Ba1\u2212ySry)2(Ca1\u2212zLz)3Cu4O12\u2212w\nwhere L represents one or more", "input_text": " temperature superconducting material that can be described by composition formula:\nCu1\u2212xTlx(Ba1\u2212ySry)2(Ca1\u2212zLz)2Cu3O10\u2212w\nwhere L represents one or more elements selected from the class which consists of Mg and alkaline metallic elements; 0\u2266x\u22661.0; 0\u2266y\u22661; 0\u2266z\u22661; and 0\u2266w\u22664.\nThe present invention also provides a selective reduction type, high temperature superconductor, characterized in that it is made of a high temperature superconducting material that can be described by composition formula:\nCu1\u2212xTlx(Ba1\u2212ySry)2(Ca1\u2212zLz)3Cu4O12\u2212w\nwhere L represents one or more elements selected from the class which consists of Mg and alkaline metallic elements; 0\u2266x\u22661.0; 0\u2266y\u22661; 0\u2266z\u22661; and 0\u2266w\u22664.\nThe present invention further provides a selective reduction type, high temperature superconductor, characterized in that selective over- or optimum-doping is effected by decrease in the valence number of ions of a constituent element by decrease in the oxygen concentration, that is by selective reduction, or by varying (increasing or decreasing) oxygen concentration.\nThe"} +{"output_text": ".\nA need therefore exists for a system and method for providing medical professionals with information about a particular subject in a timely manner. 
A further need exists for a system and method for providing medical professionals with information about a particular subject in a timely manner that does not require a substantial amount of time to locate relevant articles. Yet another need exists for a system and method for providing medical professionals with information about a particular subject in a timely manner that does not require a substantial amount of time to search through databases for", "input_text": "). The WWW is a part of the Internet that may be accessed by using a browser application. Netscape Communicator and Microsoft Internet Explorer are examples of several widely used browser applications. The IGM provides access to MEDLINE, a database which contains more than nine million articles from journals throughout the world, and 14 other databases. AIDSLINE, AIDSDRUGS, AIDSTRIALS, DIRLINE, HealthSTAR, HSRPROJ, HISTLINE, OLDMEDLINE, SDILINE, SPACELNE, BIOETHICSLINE, POPLINE, TOXLINE and ChemID, for example, are all available using the IGM.\nA problem with using search interfaces such as the IGM is that it requires a substantial amount of time to locate relevant articles. Time to search through databases for articles relevant to a particular subject is not a luxury many medical professionals can afford. Moreover, even if a person does manage to find the time to conduct a search, the articles are not provided at a time when they are immediately pertinent. For example, current systems do not provide a mechanism for providing articles to a doctor when the doctor is seeing patients.\nWhen a doctor sees a patient, information about that patient is typically entered into a medical chart. The medical chart becomes part of the patient's permanent medical history. In some instances the patient's medical chart is in electronic form. Electronic medical records provide doctors and other medical professionals with a simple way to store and retrieve information about a patient. The HBOC in Atlanta Ga. 
and the SMS in Malvern, Pa., for example, are both companies that provide and maintain electronic medical record systems. Current electronic medical record systems do not, however, provide a way for the doctor to obtain journal articles about a particular subject by automatically pulling information directly from the patient's medical chart and querying a medical library for information"} +{"output_text": " includes a means for measuring the height of the sample in the sampling chamber and a means for measuring the height of the sample in the settling chamber. The control means is responsive to the height of the sample in the settling chamber and the height of the sample in the sampling chamber to control the feed of suspension to the sampling chamber.\nAnother example of sampling equipment for measuring sedimentation rate is that described in Parker et al U.S. Pat. No. 4,318,297, issued Mar.", "input_text": " operator must estimate the amount of change in the rate of addition of flocculant which will be needed to produce a desired change in upper boundary. If he overestimates or underestimates the change in rate required, the unit may become unstable and eventually have to be shut down to avoid overloading or the carryover of solids. The upper boundary therefore provides at best a visible means for assessing the state of the thickener or clarifier operation and, if it increases progressively, it may serve as a delayed warning that the capacity of the settler has been exceeded.\nAttempts have also been made to control the operation of a settler by sampling the incoming feed slurry to the feedwell at regular intervals downstream of the point at which flocculant is added to the feed slurry. The samples thus collected are passed to a laboratory sized gravity separation vessel where representative settling can take place. 
By sensing the interface level between the liquid and solid phases in the separation vessel and adjusting the rate of flocculant addition to the feed stream in accordance with variations in the level of the interface during operation of the system, it was hoped that the rate of addition of flocculant could be controlled automatically and that the flocculant consumption could be thereby substantially reduced. However, attempts to develop such a system in the past were abandoned because none was capable of providing reliable data necessary for the control and operation of a full size commercial settler.\nOne example of sampling equipment for measuring sedimentation rate is that described in Parker et al U.S. Pat. No. 4,318,296, issued Mar. 9, 1982. This system includes a sampling chamber for a sample to be tested, timer means for controlling a control means to stop the feed of suspension to the sampling chamber and means for retaining the height of the sample at a preselected level in the sampling chamber during a settling period. It also"} +{"output_text": "\nThe first plate 110 may be formed from a metal or from a conductively doped semiconductive layer. When formed on a semiconductive substrate using conventional CMOS processing techniques and employing a dielectric analogous to a FET gate dielectric, such capacitors may be referred to as MOS capacitors or MDS (metal-dielectric-semiconductor) capacitors, although the dielectric may or may not be an oxide or silicon dioxide, and the first 110 and second 120 plates may be semiconductive or other conductive material rather", "input_text": " charge in response to signals coupled to the first 115 and second 125 electrodes. It is generally desirable to form capacitors 100 together with CMOS circuit elements but without requiring modification of standard CMOS processes. 
This allows greater choice of foundry for CMOS IC manufacturing, simplifies production, and reduces expense in realizing ICs that include the capacitor 100.\nSome applications for capacitors 100 require larger breakdown voltages than are needed for most of the other circuit elements forming the IC. Such applications may include power supplies associated with programming circuitry for programmable or nonvolatile memory elements.\nThe first 110 and second 120 plates may be realized in a number of forms in ICs using standard CMOS processes. These forms can include metal or semiconductor layers comprising the capacitor plates 110, 120, separated by a layer forming the dielectric 130, or interdigitated conductive patterns comprising the capacitor plates 110, 120.\nCapacitors 100 formed using interdigitated conductive patterns for the capacitor plates 110, 120 tend to provide relatively high breakdown voltage. These also tend to be relatively large and to provide relatively little capacitance per unit area of the IC in which they are formed.\nCapacitors 100 may also be formed by using a first conductive plate 110, which may be relatively planar, formed on or in a substrate, a relatively planar dielectric 130 disposed atop the first plate 110 and a relatively planar second conductive plate 120 formed on and supported by the dielectric layer 130. The first plate 110 may be formed from metal or from a conductively doped semiconductive layer. 
When formed on a semiconductive substrate using conventional CMOS processing techniques and employing a dielectric analogous to a FET gate dielectric, such capacitors may be referred to as MOS capacitors or MDS (metal-dielectric-semiconductor) capacitors, although the dielectric may or may not be an oxide or silicon dioxide, and the first 110 and second 120 plates may be semiconductive or other conductive material rather than including metal."} +{"output_text": " the electron beam is not allowed to pass through the opening pattern.\nIn the case of using the electron projection lithography device, throughput can be greatly improved up to 35 pieces/hour compared with the electron beam direct writing method. Compared with the conventional photolithography, however, the throughput is lower, about xc2xd. In the case of the stencil-type reticle, since the opening pattern for passing the electron beam is provided, a xe2x80x9csquare", "input_text": " manufacturing an EPL mask is relatively small. However, if mismatching is present in membrane stress between an oxide film of an intermediate layer and silicon on the surface by execution of etching of back-side Si, which makes TAT longer, mask deformation may occur, causing a shift in projection position. This positional shift is prevented by adding boron or the like to an oxide film on the surface to generate tensile stress on the substrate surface as well, and reducing stress between the oxide film and the substrate. Both methods have own features different from each other as described above, and the preceding back etching method enabling TAT to be shortened is considered to be more suitable. The oxide film is removed after the execution of the back etching. Accordingly, membrane blanks for the reticle for electron beam projection are made (FIG. 3B). Then, circuit patterns are divided into predetermined subfields, and a resist pattern 301 is formed on the reticle for electron beam projection by a resist process (FIG. 3C). 
A predetermined pattern is formed by further carrying out dry etching. Lastly, the reticle for electron beam projection is made by carrying out cleaning (FIG. 3D). As described herein, the reticle having an opening pattern for passing the energy beam is called a stencil type.\nRepresentative features of the present invention can be summarized as follows.\nIn the case of using the electron projection lithography device, throughput can be greatly improved up to 35 pieces/hour compared with the electron beam direct writing method. Compared with the conventional photolithography, however, the throughput is lower, about \u00bd. In the case of the stencil-type reticle, since the opening pattern for passing the electron beam is provided, a \u201csquare-shaped\u201d pattern called a doughnut-type pattern cannot be included. This is because"} +{"output_text": ".\nIn the 3G CDMA system, the packet data traffic is transmitted in the form of packets. The packet data traffic is transmitted in the form of packets in the uplink direction. The packet data traffic is transmitted in the form of packets in the downlink direction. The packet data traffic is transmitted in the form of packets in the downlink direction. The packet data traffic is transmitted in the form of packets in the uplink direction. The packet data traffic is transmitted in the form of", "input_text": " clock signal at the symbol rate in response to the state transition signal and the peak detected signal to adjust the phase of the symbol clock signal. An analog-to-digital converter samples each of the baseband I and Q component signals at a predetermined rate to generate a first digital sample output stream from the baseband I component and a second digital sample output stream from the baseband Q component. 
Each digital data sample of the first digital sample output stream is indicative of a logic state of the baseband I component at a respective sample time and each digital data sample of the second digital sample output stream is indicative of a logic state of the baseband Q component at a respective sample time. The predetermined rate is at least twice the Nyquist frequency of the baseband I and Q component signals. A voter circuit receives the first and second digital data sample streams, and, in response to occurrence of the symbol clock signal, compares a current digital data sample of the first digital data sample output stream to prior and subsequent digital data samples of the first digital data sample output stream to provide an I state signal output and, in response to occurrence of the symbol clock signal, compares a current digital data sample of the second digital data sample output stream to prior and subsequent digital data samples of the second digital data sample output stream to provide a Q state signal output. This invention relates to packet transmission in multiple access communication systems.\nIn this patent document, the references referred to in square brackets, eg [1], are listed at the end of the disclosure.\nThe 3G CDMA system is supposed to support simultaneous voice/packet data/circuit data operations [1] with different QoS requirements. 
Since the traffic characteristics and the quality of service requirements for packet data services are quite different from those of voice traffic, system re-design in certain essential aspects is necessary for an integrated voice-data system without impacting voice quality or sacrificing data performance"} +{"output_text": " member pivotally mounted on said seat and having a downwardly extending portion, a receptacle pivotally mounted on said support member and having a downwardly extending portion, a tripping lever pivotally mounted on said support member and having a downwardly extending portion, a load responsive terminal at one end of said lever upon which a minimal marginal portion of the bottom of said receptacle is supported, free-turning rotating force multiplication motion transmitting means, and flexible means associated and operatively connected with said last", "input_text": " of the Invention\nThis invention relates generally to mechanized announcement or sound producing devices and more particularly to such a device as used with a toilet.\n2. Description of Related Art\nThe following art defines the present state of this field:\nFindley, Jr., U.S. Pat. No. 2,721,531 describes a toilet signal device for training infants comprising a toilet seat, a signal device, and actuator for said signal device terminating in a substantially hooked portion, hooked means extending from the other end of said toilet seat and carried by the latter, and a disposable trip supported at each end by said hooked portion of said actuator and said hook means.\nLee, U.S. Pat. No. 
3,059,608 describes a training chair comprising a child's chair construction having a seat provided with an accommodation opening, a normally horizontal load-responsive downwardly tilting receptacle, means fixed to the chair structure in a plane below the seat pivotally suspending said receptacle below and in alignment with said opening with the upper portion normally spaced below the underneath side of said seat, a tripping lever constituting a load-responsive normally balanced beam and being provided with adjustable balancing means, means supporting said lever for pivotal movement, said lever and means being offset relative to a rearward tilting portion of said receptacle, a load responsive terminal at one end of said lever upon which a minimal marginal portion of the bottom of said receptacle is supported, free-turning rotating force multiplication motion transmitting means, and flexible means associated and operatively connected with said last named means providing forces for the actuation of a sound emitting signaling device mounted cooperatively on said chair.\nGarthofner, U.S. Pat. No. 3,172,390 describes a toilet training chair comprising supporting sides and a seat connected therebetween, said seat having an opening, a planar surfaced support"} +{"output_text": " high opacity.\nThe present invention is directed to a method of making a paper product. The method includes the steps of: (a) providing a papermaking furnish comprising a cellulosic fiber suspension; (b) adding to the furnish a quaternary ammonium compound; (c) adding to the furnish a fatty acid ester; and (d) forming the paper product from the furnish.\nThe present invention is also directed to a method of making a paper product. 
The method includes the steps of", "input_text": " solids content emulsion forms, the high application doses needed for yielding good opacity, the resultant losses in sheet strength properties and accompanying papermachine deposit issues are significant end-user issues that need improvement.\nThe present invention responds to this need via the discovery that quaternized alkanolamine fatty acid esters can be employed in papermaking operations to provide improved optical performance as compared to the prior art organic opacification aids currently being used, e.g., fatty amides of alkanoldiamines, quaternized versions of these fatty amides, and mixtures of fatty oils and various amine-esters. Use of the quaternized alkanolamine fatty acid esters, hereinafter more simply referred to as diester quats, also provides control over other aspects of the papermaking operation, e.g., decreasing inorganic filler/pigment amounts for the purposes of improving strength properties and/or decreasing paper grammage without a loss in opacity.\nQuaternized alkanolamine fatty acid ester compounds are known and their use in papermaking methods has been proposed in U.S. Pat. Nos. 5,217,576, 5,223,096, 5,240,562, 5,264,082, 5,415,737, and 5,427,696. Each of these patents centers around modifying paper properties in tissue and towel paper grades. This prior art teaches the use of various quaternary ammonium chemical softening compounds, which include quaternized alkanolamine fatty acid esters. While this art teaches the use of these compounds as softening aids which impart a soft feel and more absorbent paper in the stated paper areas, there is absolutely no recognition of the use of the quaternized alkanolamine fatty acid esters as a wet-end papermaking additive for improving opacity. In fact, opacity is not even an issue with these grades, since tissue and towel paper grades are not commonly used or designed for"} +{"output_text": " a subscriber to a wireless handset. 
The subscriber's wireless handset is connected to a wireless communication network via a wireless base station. The wireless base station is connected to a central database via a wireline network. The central database provides data instructions to the wireless base station to forward incoming calls to the wireless handset.\nU.S. Pat. No. 5,621,579, to KAY et al., discloses a system for providing specialized calling features to stations connected to a", "input_text": " the subscriber, and performing final call dispositions other than routing to the telephone number provided by the subscriber. Processing of the call traffic information dynamically changes the subscriber's call routing program to reduce the number of blocked calls to the subscriber's telephone numbers.\nU.S. Pat. No. 5,247,571, to KAY et al., discloses an Area Wide Centrex system to provide specialized calling features to stations connected to a plurality of central offices. Each of the central office switching points connects to a number of local telephone lines. The features are extended to the local telephone lines by taking the programming intelligence out of the central offices and moving it into a database located at a centralized location, such as an SCP. Service features are controlled by the central database and are changed by reprogramming the service logic located at the central database. A variety of service features are provided including a work at home service that enables a user of a private network access from a home telephone and access authorization to increase the security of the private network.\nU.S. Pat. No. 5,353,331, to EMERY et al., discloses an AIN system which connects to, and controls processing of, calls to a subscriber's wireless handset via a home base station or wireless communication network. In response to calls directed to the subscriber's wireless handset, the AIN determines where the handset is located using a central database and routes the call to that location. 
The incoming call can be routed directly to the handset, blocked, or routed to an alternate termination point. In response to calls from the handset, the central database provides data instructions to the landline network to extend a requested special service to the subscriber.\nU.S. Pat. No. 5,592,541, to FLEISCHER, III et al., discloses an AIN network environment for forwarding incoming calls by"} +{"output_text": " the clock enable signal and may provide the release signal having the clock enable logic level and the clock disable logic level. The clock enable signal may be received after the first refresh release command and the clock enable signal may be received after the second refresh release command.\nAccording to another aspect of the embodiments, a semiconductor memory device may include a clock generator circuit coupled to receive an external clock and providing an internal clock. A set signal output circuit may be coupled to receive a plurality of input signals. The set", "input_text": " a complementary logic gate input buffer.\nAccording to another aspect of the embodiments, the status latch signal output circuit may include a reset-set (RS) flip-flop set in response to the set signal and reset in response to the release signal.\nAccording to another aspect of the embodiments, the semiconductor memory device may be a dynamic random access memory and the refresh mode may be a self refresh mode.\nAccording to another aspect of the embodiments, a semiconductor memory device may include a clock generator circuit coupled to receive an external clock and providing an internal clock. A set signal output circuit may be coupled to receive a plurality of input signals. The set signal output circuit may provide a set signal that may set the operation of the semiconductor memory device to a refresh mode based on the plurality of input signals received in synchronism with the internal clock indicating a refresh set command. 
A release signal output circuit may be coupled to receive the plurality of input signals. The release signal output circuit may provide a release signal that releases the refresh mode based on the plurality of input signals in synchronism with the internal clock indicating a refresh release command. The refresh release command may include a first refresh release command and a second refresh release command and the second refresh release command may be received after the first refresh release command. A status latch signal output circuit may be coupled to receive the set signal and the release signal. The status latch signal output circuit may provide a status latch signal indicating a refresh mode state. The refresh mode state may be set in response to the set signal and the refresh mode may be released in response to the reset signal. An enable circuit may be coupled to receive a clock enable signal and may provide an enable signal having a clock enable logic level and a clock disable logic level and may enable the internal clock to be generated based on an external clock when at the clock enable logic level. The release signal output circuit may be coupled to receive"} +{"output_text": "ied) to remove excess water. The wrung leather is then dried 30. The drying process is performed in a series of long nip presses 40. The long nip presses 40 are used to remove water from the leather. The long nip presses 40 are also used to impart a uniform thickness to the leather.\nThe long nip presses 40 are typically constructed of a pair of opposing rollers 42, 44. The rollers 42, 44 are typically constructed of a steel or other metal. The", "input_text": " plastics manufacturing or as low sulfur biofuel. 
The lignin and other organic material is preferably precipitated from cellulose waste liquors with solids content in the range of 3-30% supported by the action of an acid, preferably carbon dioxide recovered from gases with the origin from combustion of cellulose spent liquor. 1. Field of the Invention\nThe present invention relates to the leather tanning arts. More specifically, the present invention relates to a long nip press for drying tanned leather hides.\n2. Description of the Prior Art\nLeather tanning is the process of converting raw hides or skins into leather. Hides and skins have the ability to absorb tannic acid and other chemicals that prevent them from decaying. FIG. 1 is a general flow diagram of the leather tanning and finishing process. The raw hides are \u201ccured,\u201d a process which involves salting and/or drying the hide once it's been stripped from the animal.\nThe first steps, commonly referred to as the \u201cbeamhouse\u201d operations 10, prepare the hides for tanning 20. The cured hides are trimmed and soaked to remove salt and other solids, and to restore moisture lost during curing. The hides are then fleshed to remove excess tissue and impart a uniform thickness. The hair is removed from the hides by soaking in a lime/water mixture to loosen the hairs and then mechanically removing the loosened hairs.\nThese prepared hides are now ready for the tanning operations 20. Tanning may be performed using either trivalent chromium salts or vegetable tannins extracted from specific tree barks. Chrome tanned leather is softer, more pliable, and quicker to produce than vegetable tanned leather. Chrome tanning is performed using a one-bath process that is based on the reaction between the hide and the chromium salt.\nFollowing chrome tanning, the tanned leather is wrung (or samm"} +{"output_text": " materials, a composite material is formed into a desired shape and then the composite material is cured or hardened. 
The curing or hardening of the composite material is typically accomplished by exposing the composite material to a curing or hardening agent, such as a chemical curing agent or a radiation curing agent.\nIn the manufacture of articles and components from composite materials, it is often desirable to apply a protective coating to the composite material to protect the composite material from damage during the manufacturing process. For example, it is", "input_text": " earth elements and Zn, Mg and Ni. Cobalt, manganese and aluminum are common additives.\nThe components of the NIMH battery include nickel metal grid, Ni(OH)2, nickel coated iron, potassium hydroxide electrolyte, and most importantly a nickel metal alloy powder of up to 25-30% by weight. This alloy powder has been developed to absorb considerable hydrogen and is the source of the descriptor \u201cnickel metal hydride\u201d battery. Under charging conditions this nickel alloy absorbs significant amounts of hydrogen as the metal hydride is formed electrochemically. Under battery discharge conditions this absorbed hydrogen reacts electrochemically back to hydroxide and water providing the electrical current of the battery. The currently most well known nickel alloy used is termed AB5 which is an alloy consisting of one part misch metal (mostly lanthanum or REM) to five parts nickel on a mole basis\u2014theoretically 32.1% (REM) on a weight basis. Therefore the naturally occurring rare earth oxide mixture is used to form the misch metal which avoids the expense of separating the rare earth oxides into the individual elements before reducing them to the mixed metal and not to the pure metal such as pure lanthanum metal. This metal mixture is used which is called misch metal. Therefore the AB5 alloy is an alloy of a mixture of lanthanum group metals and nickel with some cobalt and other metals added in small amounts for optimized hydrogen formation and storage. 
This AB5 component is the most expensive raw material cost for this battery. The use of fiber reinforced/resin matrix composite materials in the manufacture of articles and components is becoming increasingly widespread in a number of industries, including the aircraft industry. Such composite materials include, for example, graphite fiber reinforced/epoxy resin matrix materials and glass fiber reinforced/polyimide resin matrix materials. In a common manufacturing procedure for producing articles and components from composite"} +{"output_text": " the capacity of a switch is the number of subscriber lines into the switch. The number of subscriber lines into a switch is limited by the number of line concentrators in the switch.\nThe number of line concentrators in a switch is limited by the number of line concentrators that can be physically installed in the switch. The number of line concentrators that can be installed in a switch is limited by the number of line concentrators that can be physically installed in the switch. The number of line concentr", "input_text": " the interoffice switching and transport network, and the originating end user switch. The point of greatest congestion is the switch serving the ISP, because many different users call into the ISP simultaneously.\nLECs have engineered and sized their networks based on assumptions about voice traffic. In particular, several decades of data collection and research by AT&T, Bellcore, and others has shown that an average voice call lasts 3-5 minutes, and that the distribution between long and short calls follows a well-established curve. Because very few people stay on the line for very long periods of time, there is no need for LEC switches to support all users of the switch being connected simultaneously. Instead, LEC switches are generally divided into \u201cline units\u201d or \u201cline concentrators\u201d with concentration ratios of typically between 4:1 and 8:1. 
In other words, there are between four and eight users for every call path going through the switch. Call blockage on the voice network tends to be negligible because a significant percentage of users are unlikely to be connected simultaneously.\nThe distribution of Internet calls differs significantly from voice calls. In particular, Internet users tend to stay on the line substantially longer than voice users.\nBecause LEC networks have not been designed for these longer usage patterns, heavy Internet usage can result in switches being unable to handle the load (\u201cswitch congestion\u201d). Internet connections tie up an end-to-end call path through the PSTN for the duration of the call. When the average hold time of calls through a switch increases significantly, the likelihood of all available call paths through the switch being in simultaneous use also goes up. If a particular line unit has an 8:1 concentration ratio, only one eighth of the subscriber lines into that line unit need to be connected at one time in order to block all further calls.\nBecause of the relatively short average duration of voice calls, the primary limiting factor on"} +{"output_text": " is started until the paper is accelerated for reversal, or by measuring the actual time from when the paper is accelerated for reversal to when the paper is decelerated for image formation.\nIn the above aspect, the position or the timing at which the paper is accelerated for reversal may be determined by measuring the actual time from when the paper is accelerated for reversal to when the paper is decelerated for image formation.\nIn the above aspect, the position or the timing at which the paper is accelerated", "input_text": " the present invention, there is provided an image forming apparatus comprising first paper supply means for supplying paper one sheet at a time; second paper supply means for receiving the paper from the first paper supply means and conveying the paper to an image formation 
section; the image formation section for forming an image by fixing a toner image after the toner image has been transferred to the paper supplied from the second paper supply means; and circulatory conveyance means for reversibly conveying and circulatorily conveying the paper, on which the image has been formed on a first surface thereof, to the second paper supply means once again, in order to form an image on a second surface of the paper, on which the image has been formed on the first surface thereof by the image formation section; after an image has been sequentially formed on the first surfaces of a predetermined number of sheets of the paper by the image formation section, the sheets of paper being reversibly conveyed and circulatorily conveyed to the second paper supply means by the circulatory conveyance means, and then image being successively formed on the second surfaces of the paper, on which the image has been formed on the first surfaces thereof; wherein a position at which the paper, on which the image has been formed on the first surface thereof, is accelerated from the paper conveyance speed during image formation in order for reversal and a position at which the post-reversal paper conveyance speed is decelerated in order to form the image on the second surface of the paper are alterable; and wherein images are formed on both sides of the paper without effecting control of the operation of the second paper supply means restricted by the number of image formed sheets per unit of time.\nIn the above aspect, the position or the timing at which the paper is accelerated for reversal may be determined by measuring the actual time from when conveyance of the paper from the second supply means to the image formation section"} +{"output_text": " layer. 
The bisazo series compound is represented by the following formula: ##STR2## wherein R.sub.1 and R.sub.2 are the same or different and each represents a hydrogen atom, a halogen atom, a cyano group, a nitro group, a C.sub.1-4 alkyl group, a C.sub.1-4 alkoxy group, a C.sub.1-4 alkylthio group, a C.sub.1-4 alkyl", "input_text": "800, 4,309,611, 4,418,133, 4,293,628, 4,427,753, 4,495,264, 4,359,513, 3,898,084, and Japanese Patent Publication 60-111247.\nU.S. Pat. No. 4,755,443 discloses a photoreceptor for electrophotography which comprises a charge carrier generating material and charge transport material wherein one charge generating material is a metal phthalocyanine or a metal-free phthalocyanine. The layer containing the generator material also contains an organic amine. Other carrier generating substances can be used in combination with the phthalocyanine generator material, including azo pigments, anthraquinone dyes, perylene dyes, polycyclic quinone dyes, and methine stearate pigments.\nU.S. Pat. No. 4,424,266 discloses an electrophotographic photosensitive element having a conductive support and a photosensitive layer comprising a carrier generating phase layer containing a carrier generating material selected from the group consisting of perylene dyes, polycyclic quinones, and azo dyes, and a carrier transporting phase layer containing a hydrazone carrier transporting material. The carrier generator materials can be used either singly or in combination.\nU.S. Pat. No. 4,882,254, the disclosure of which is totally incorporated herein by reference, discloses a layered photoresponsive imaging member which comprises a supporting substrate, a photogenerator layer comprising a mixture of first and second pigments, and an aryl amine hole transport layer. 
The mixture of pigments is selected from perylenes and phthalocyanines, polycyclic quinones and phthalocyanines, or perinones and phthalocyanines.\nJapanese Patent Publication J01-198-763 discloses an electrophotographic photoreceptor containing a bisazo series compound in a photosensitive"} +{"output_text": " a switched-capacitor amplifier circuit, the OpAmp gain and DC input offset voltage are sampled twice in each clock period. The OpAmp gain is sampled by a first OpAmp sample and a second OpAmp sample. The DC input offset voltage is sampled by a first OpAmp sample and a second OpAmp sample. The OpAmp gain and DC input offset voltage are then subtracted from the first OpAmp sample and the second OpAmp sample to produce", "input_text": " Publication No. 20020195538 to Ellson et al. describes the use of acoustic ejection to selectively deposit analysis-enhancing fluid according to the surface characteristics of the cellular samples.\nAs alluded to above, conventional analysis-enhancing fluids for use in mass spectrometry are typically comprised of a mass spectrometry matrix material dissolved in a volatile carrier fluid. Once deposited on a sample surface, the carrier fluid is evaporated, thereby allowing the matrix material to precipitate and crystallize with the sample. It has recently been discovered, however, that such conventional analysis-enhancing fluids are not optimal for use in mass spectrometry when dispensed as low-volume droplets under ordinary dispensing conditions, because such fluids do not allow the matrix material to properly crystallize with the sample.\nAccordingly, there is a need for methods and systems that overcome the disadvantages and limitations associated with previously known technologies. 1. Field\nThis invention relates generally to switched-capacitor circuits and more specifically to a switched-capacitor amplifier circuit that may be disposed on an integrated circuit.\n2. 
Related Art\nA switched-capacitor circuit is a circuit that provides signals that are discrete in time and continuous in voltage amplitude. Correlated double sampling (CDS) is a technique used with switched-capacitor circuits to measure small, slowly changing signals in the presence of large amounts of low frequency (1/f) noise and direct current (DC) input offset voltage. The CDS technique is a particular type of auto-zero technique, in which noise and a DC input offset voltage are sampled twice in each clock period. Switched-capacitor amplifiers often use the CDS technique to compensate for non-idealities of an operational amplifier (OpAmp) in the switched-capacitor circuit such as finite open-loop gain (hereinafter \u201cOpAmp gain\u201d) and DC input offset voltage. In"} +{"output_text": "er (WDM) 131a. The subscriber unit 105b comprises a wavelength division multiplexer/demultiplexer (WDM) 131b connected to the fiber 104b; a receiving photodiode (PD) 132b for receiving a wavelength band of a video signal separated by the wavelength division multiplexer/demultiplexer (WDM) 131b and for outputting it as an electric signal; a video receiver 133b supplied with the electric signal; and a transmitting and receiving", "input_text": " a subscriber unit connected to one of the optical fibers 104a-104c. Since the split number of a single star coupler is 32 at present, the total of 32 subscriber units can be connected to each star coupler by connecting them to the split output terminals via the optical fibers 104a-104c.... \nThe central office unit 101 comprises a transmitting laser diode (LD) 112 for outputting a video signal generated by a video signal generator 111 in the form of an optical signal; a wavelength division multiplexer/demultiplexer (WDM) 113 supplied with the output of the transmitting laser diode (LD) 112 and the output of the transmitting and receiving section 114; an electric signal multiplexer/demultiplexer 115; and a processing section 116. 
The transmitting and receiving section 114 includes a wavelength division multiplexer/demultiplexer (WDM) 121; a receiving photodiode (PD) 123 for converting an optical signal supplied from the wavelength division multiplexer/demultiplexer (WDM) 121 into an electric signal; a transmitting laser diode (LD) 122 for converting an electric signal to an optical signal; and a signal processor 124. The processing section 116 includes a signal processor 117, a transmitting laser diode (LD) 118 and a receiving photodiode (PD) 119.\nThe subscriber unit 105a comprises a wavelength division multiplexer/demultiplexer (WDM) 131a connected to the fiber 104a; a receiving photodiode (PD) 132a for receiving a wavelength band of a video signal separated by the wavelength division multiplexer/demultiplexer (WDM) 131a and for outputting it as an electric signal; a video receiver 133a supplied with the electric signal; and a transmitting and receiving section 134a supplied with signals other than the video signal separated by the wavelength division multiplexer/demultiplex"} +{"output_text": " ring resonator with cylindrical mirrors. U.S. Pat. No. 4,137,914, issued to Hill, et al., on Jan. 30, 1979, entitled RING LASER WITH A MULTIPLE-WAVE REGENERATIVE AMPLIFIER, relates to a ring laser with a multiple-wave regenerative amplifier. U.S. Pat. No. 4,137,916, issued to Hill, et al., on Jan. 30, 1979", "input_text": " Feb. 23, 1971, entitled OPTICAL COMMUNICATION ARRANGEMENT UTILIZING A MULTIMODE OPTICAL REGENERATIVE AMPLIFIER FOR PILOT FREQUENCY AMPLIFICATION, relates to an optical communication system: with a ring amplifier. U.S. Pat. No. 3,646,468, issued to Buczek, et al. on Feb. 29, 1972 relates to a laser system with a low power oscillator, a high power oscillator and a resonance adjustment means. U.S. Pat. No. 3,646,469, issued to Buczek, et al. on Feb. 
29, 1972, entitled TRAVELLING WAVE REGENERATIVE LASER AMPLIFIER, relates to a laser system like that of the '468 Buczek patent with a means for locking the resonant frequency of the amplifier to frequency of the output of the oscillator. U.S. Pat. No. 3,969,685, issued to Chenausky on Jul. 13, 1976, entitled ENHANCED RADIATION COUPLING FROM UNSTABLE LASER RESONATORS relates to coupling energy from a gain medium in an unstable resonator to provide a large fraction of the energy in the central lobe of the far field. U.S. Pat. No. 4,107,628, issued to Hill, et al., on Aug. 15, 1978, entitled CW BRILLOUIN RING LASER, relates to a Brillouin scattering ring laser, with an acousto-optical element modulating the scattering frequency. U.S. Pat. No. 4,135,787, issued to McLafferty on Jan. 23, 1979, entitled UNSTABLE RING RESONATOR WITH CYLINDRICAL MIRRORS, relates to an unstable"} +{"output_text": " photographing method of a camera-enabled portable terminal.\nThe above information is presented as background information only to assist with an understanding of the present disclosure. No determination has been made, and no assertion is made, as to whether any of the above might be applicable as prior art with regard to the present invention.", "input_text": " at step S19. If it is determined that the number of images is not equal to the windows, the portable terminal displays the status indication icon representing the present status of the respective windows, as shown in FIG. 2A and FIG. 2B, and displays the preview image on the whole screen at step S21.\nOn the other hand, if it is determined that the number of images is equal to the number of windows, the portable terminal displays the images filled out in the respective windows on the whole screen as shown in FIG. 2C at step S23. 
After all the images for filling out the windows are captured, the portable terminal determines whether or not a frame refresh command is inputted while the images are displayed on the screen at step S25. If there is no frame refresh command, the portable terminal determines whether or not a save command is inputted at step S29. If the save command is inputted, the portable terminal saves the images filled out in the windows as a single frame at step S31.\nIf the frame refreshment command is inputted at step S25, the portable terminal clears the images displayed on the screen at step S27 and carries out steps of S15 to S25.\nHowever, the conventional multi-image photographing method of the camera-enabled portable terminal has a drawback in that it is impossible to check the respective images before all of the images for filling out the windows are captured. This is because all the images captured for forming a single frame are displayed and checked after all the images are captured.\nAlso, in the conventional multi-image photographing method of the camera-enabled portable terminal, all the images captured for filling out the frame are discarded and retaken even when only one image is required to be changed since a refresh is performed on a frame basis.\nAccordingly, there is a need for an improved multi-image"} +{"output_text": " received prior to the jog signal received from interface #1, the jog status field is set to a value of \u201c0\u201d and the jog request signal is not executed.\nIn this scenario, the jog status field is set to a value of \u201c0\u201d and the jog request signal is not executed. However, the jog request signal is not executed because the jog status field is set to a value of \u201c0\u201d. 
In this scenario, the jog request signal is not executed because the jog status field", "input_text": "., placement of a machine head, a jog command is a command resulting in a predefined amount of displacement (e.g., motion) for each generation of a jog request. For example, pressing of a \u201cmove\u201d button can cause a machine head to move continuously in a certain direction while the move button is pressed. However, a \u201cjog\u201d button can be programmed such that only a predefined degree of displacement is effected for a single press of the button and the button has to be \u201creleased\u201d (e.g., finger pressure removed from button) before the button can be re-pressed for the next jog request to be effected. For example, a machine can be configured, such that when the operator executes (e.g., presses) the jog button the machine head travels a prescribed distance, e.g., ten thousandths of an inch, one millimeter, 10 microns, etc.\nHowever, if signals are being received from a plurality of interfaces a problem can occur where a controller, associated with the plurality of interfaces, is using a single jog status field to determine whether a machine jog is to be performed. For example, in a situation where jog request signals are being generated by any or all of three interfaces coupled to a controller, and the jog request signals each update a single jog status field, then a jog status received from a first interface can be overwritten by a jog status subsequently received from a second interface. In this scenario, a situation can occur where a jog signal is received from interface #1, but a subsequent no jog signal is received from interface #2. Upon receiving a jog signal from interface #1 a jog status field is set to a value of \u201c1\u201d, however, upon receiving the subsequent no jog signal from interface #2, the jog status field is set to a value of \u201c0\u201d. 
Given that the no jog signal received from interface #2 is"} +{"output_text": " a plurality of computers, discloses a technique for collecting log data from a plurality of computers by a single computer.\nHowever, in the conventional technique disclosed in Japanese Published Unexamined Patent Application No. Hei 5-250229, since the single computer collects log data from a plurality of computers, the load on the computer is increased. Further, since the computer collects log data from a plurality of computers, the load on the computer is increased.\nIt is an object of the present invention to", "input_text": " data processing by the plurality of servers in cooperation with each other, as well as local data processing within each server. In this distributed data processing system, if a software error occurs in one server when data processing is performed, error analysis information of the server is occasionally insufficient to find the cause of the error. Further, since almost all the error analysis information necessary for investigation into the cause of error such as trace information of a software product is managed by a wrap around function, if saving of error analysis information into a saving file is delayed, the important information may be lost.\nGenerally, when a system error including a software bug occurs, an error alarm message notifying the occurrence of error is displayed on a monitoring terminal apparatus. Conventionally, when an error alarm message is displayed on the display of the monitoring terminal, an operator of the monitoring terminal notifies a system administrator (otherwise, a manufacturer or the like) of the occurrence of error. 
Then, the system administrator checks the content of the error and the server where the error has occurred (hereinafter, referred to as a \u201ctroubled server\u201d), and starts to collect error analysis information necessary for investigation into the cause of the error.\nIf an error occurs when data processing is performed by a plurality of servers in cooperation with each other, it is necessary to collect error analysis information from not only the troubled server but also the other servers than the troubled server. In this case, it is necessary to specify servers from which error analysis information is to be collected, and instruct these servers to quickly save error analysis information or transfer the information to the monitoring terminal apparatus.\nAs a conventional technique for collecting log data from a plurality of computers, Japanese Published Unexamined Patent Application No. Hei 5-250229, for resolving the problem of an increase in load caused by automatic log-data transmission request to"} +{"output_text": "body exposure is estimated to be between 1.5 and 2 Gy (2). The dose required to cause a fatal cancer is estimated to be between 1 and 2 Gy (3). The dose required to cause a fatal cancer is estimated to be between 1 and 2 Gy (3). The dose required to cause a fatal cancer is estimated to be between 1 and 2 Gy (3). The dose required to cause a fatal cancer is estimated to be between 1 and 2 Gy (3). The dose required to", "input_text": ")-(c), above, to improving the therapeutic ratio. A combination of improved physical targeting, fractionation and radiomodifiers could transform the intent in some radiotherapy situations from palliative to curative. 
For curative schedules, successful application of radiomodifiers would relax the requirement for fractionation and hence reduce overall costs of treatment, which to a large extent is proportional to the number of treatment fractions per patient.\nA particularly important role for radioprotectors has emerged from the recognition that accelerated repopulation of tumour cells during radiotherapy can seriously compromise the effectiveness of treatment. The main consequences of this have been as follows: (i) The development of accelerated treatment schedules to reduce the overall time of radiotherapy treatment. In such accelerated schedules, acute reactions are a particular problem. For example, acute oral mucositis in head and neck cancer patients indicates a clear need for radioprotectors. (ii) The recognition that the interruption of radiotherapy treatment due to normal tissue reactions will reduce the probability of tumour control. Accordingly, the use of radioprotectors to prevent toxicity-induced treatment interruption would be clearly beneficial. \nThe events of 11 Sep. 2001 prompted assessments of vulnerability to many types of terrorism scenarios, amongst which is a collection described as radiological terrorism. An example is the so-called \u201cdirty bomb\u201d involving dispersal of some form of radioactivity with conventional explosive. Whilst attention is focused on the acute radiation syndrome (ARS; also referred to as \u201cradiation sickness\u201d), which describes the consequences of whole-body exposure to radiation doses greater than 1 Gy, there are also concerns about the longer-term effects of low doses, namely radiation-induced mutagenesis and carcinogenesis (1). 
This general situation, and the realisation that no prophylactic agents are available to provide protection against exposure to ionising radiation has generated significant research and political activity.\nThe mean lethal dose of radiation required to kill 50% of humans 60 days after whole-"} +{"output_text": " inspection. The single laser approach is not suitable for the analysis of the defects with low contrast in thermophysical parameters in relation to base material under inspection. \nThe above limitations of the conventional active thermography based on single laser are overcome by the present invention. The present invention provides a method and apparatus for detecting defects in a material, such as cracks, delaminations, and other defects, using a laser beam. The present invention is particularly useful for detecting defects in a material, such as", "input_text": " the possibility of manipulating the shape of the source intensity with a constant output which clearly relates to forced diffusion thermographic instrument. Similar concept was also disclosed which uses the line-scanning method to heat and measure the sample with a photothermal test camera while the system design also allows manipulation of the laser beam shape. The contribution of ideas over the years prompted the introduction of algorithm calculation to enhance the capability of such systems. However, they only mention about the detection of high contrast defects such as cracks and delaminations, but none was found to include the detection of low-contrast defect such as minor heat damage.\nThere are several limitations of the conventional active thermography based on flash lamps, which limit its application only to the defects with high contrast in thermophysical parameters in relation to base material under inspection: It is too challenging to achieve uniform illumination of the material, which introduces lateral temperature gradients that will dominate the IR image. 
Even if the uniformity of illumination can be achieved, it is practically impossible to avoid variation of the light absorption at the surface, which will depend on material composition, surface structure and finishing and presence of surface contamination. Even after the flash is applied, the glow from the lamp stays strong for several seconds and is reflected from the sample into the camera. This makes it impossible to use the thermography at early stages of thermal transition. Application in ambient condition causes cooling of the surface through convection, which contributes significantly after 10 seconds of observation. \nSome of the limitations of the thermography based on single laser are listed below: It requires scanning of the single beam, which restricts the analysis to a relatively small area of the sample that can be examined in a reasonably short time. Single laser approach makes it difficult to compare two different spots within the area of interest. The reason for this is that the analysis of one spot inevitably leads to the temperature increase in the whole part under"} +{"output_text": " and a lower base. The base includes a pedestal for supporting a wafer substrate. The pedestal is typically made of a material that is compatible with the deposition process. For example, the pedestal may be made of quartz, which is compatible with the deposition process. The pedestal is typically coupled to a gas distribution plate that is coupled to a gas source. The gas distribution plate is typically made of a material that is compatible with the deposition process. For example, the gas distribution plate may be", "input_text": ". 1. Field of Invention\nThe present invention relates to semiconductor processing, and in particular, to chemical vapor deposition in a high density plasma reactor.\n2. Related Art\nHigh density plasma (HDP) chemical vapor deposition (CVD) processes are used in the fabrication of integrated circuits for depositing films on a substrate. 
One application of an HDP CVD process is to fill gaps on a semiconductor device having high aspect ratios (e.g., about 2.5:1 or greater) and close spacing (e.g., about 0.25 \u03bcm or less). Existing HDP CVD processes typically employ deposition with a process gas mixture that includes oxygen, silane, and inert gases, such as argon, to achieve simultaneous dielectric etching and deposition.\nIn an HDP process, RF bias is applied to a wafer substrate in a reaction chamber. As a result, the flux of deposition precursors is perpendicular to the wafer, and the net film growth occurs perpendicularly to the bottom of the feature. Some of the gas molecules (particularly argon) are ionized in the plasma and accelerate toward the wafer surface when the RF bias is applied to the substrate. Material is thereby sputtered when the ions strike the surface. As a result, dielectric material deposited on the wafer surface is simultaneously sputter-etched to help keep gaps open during the deposition process, which allows higher aspect ratio gaps to be filled.\nAn important goal in HDP deposition of these and other layers is to deposit a film of uniform thickness across the surface of a substrate and across different batches of substrates. One factor militating against uniform deposition is dopant concentrations in the processing environment. 
In HDP CVD processes, this is important because the reactor can act either as a sink or a source for dopants that affect the growth rate on the wafer.\nA typical HDP CVD reactor includes a reaction chamber having an upper lid"} +{"output_text": "\nThe device according to the invention is particularly advantageous in that it can be used to fill a storage reservoir of any shape and size, and that it can be used to fill a storage reservoir of any shape and size, and that it can be used to fill a storage reservoir of any shape and size, and that it can be used to fill a storage reservoir of any shape and size, and that it can be used to fill a storage reservoir of any shape and size, and that it can be", "input_text": " is the case in the document DE-A-9 842 273 in which the hopper of a storage reservoir dispenses a batch of tablets under gravity into a temporary reservoir formed in the distributing device, upstream of rectilinear distributing brushes, and in which the surplus product is sucked up and recycled.\nThe object of the present invention is to remedy these drawbacks by providing a device affording complete and regular filling, without damaging the dispensed tablets, and which can very rapidly be transferred from one installation to another, while being able to receive tablets of different shapes and dimensions, thereby improving the loading schedule of each packaging installation, and that of a plant.\nTo this end, the device according to the invention consists of a single enclosure containing and carrying the storage reservoir, the dispensing means, and the distributing means consisting of cylindrical and rotary brushes. 
The dispensing means consists of two parallel flaps which, forming the bottom of the reservoir and delimiting a buffer bay, are inclined with respect to the horizontal, downward in the upstream sense, come into closure contact against the upstream transverse wall of the enclosure, near to the blister sheet, are translationally mobile and are linked, by their posterior ends, to independent means able to move them in one direction or the reverse, the control of the means of opening and closing each flap reacting, in respect of the upper flap, to a volumetric sensor detecting the quantity of tablets accumulated in the buffer bay and, in respect of the lower flap, to the sensor disposed above the travelling sheet, upstream of the first brush, and detecting the quantity of tablets accumulating against the first brush.\nWith this device, when the quantity of tablets accumulated against the first brush reaches a specified minimum volume, the corresponding sensor triggers the sliding of the lower flap and thus causes all or some of the content of the buffer bay to be emptied in small waves as close as possible to the filling zone."} +{"output_text": ", the process is too random to allow for the selection of improvements, since neutral mutations are also introduced in the process.\nFinally, repeated cycles of cassette mutagenesis will also lead to the accumulation of neutral mutations, which can affect, for example, immunogenicity but not binding affinity.\nThus, cassette mutagenesis was found to be too random to allow for the block changes that are required for continued sequence evolution (1, 2).\nIn contrast to the error-prone PCR and cassette mutagenesis, the", "input_text": " selection round followed by grouping into families, arbitrarily choosing a single family, and reducing it to a consensus motif, which is resynthesized and reinserted into a single gene followed by additional selection. 
This process constitutes a statistical bottleneck, it is labor intensive and not practical for many rounds of mutagenesis.\nError-prone PCR and oligonucleotide-directed mutagenesis are thus useful for single cycles of sequence fine tuning but rapidly become limiting when applied for multiple cycles.\nError-prone PCR can be used to mutagenize a mixture of fragments of unknown sequence (11, 12). However, the published error-prone PCR protocols (11, 12) suffer from a low processivity of the polymerase. Therefore, the protocol is very difficult to employ for the random mutagenesis of an average-sized gene. This inability limits the practical application of error-prone PCR.\nAnother serious limitation of error-prone PCR is that the rate of down-mutations grows with the information content of the sequence. At a certain information content, library size, and mutagenesis rate, the balance of down-mutations to up-mutations will statistically prevent the selection of further improvements (statistical ceiling).\nFinally, repeated cycles of error-prone PCR will also lead to the accumulation of neutral mutations, which can affect, for example, immunogenicity but not binding affinity.\nThus error-prone PCR was found to be too gradual to allow the block changes that are required for continued sequence evolution (1, 2).\nIn cassette mutagenesis, a sequence block of a single template is typically replaced by a (partially) randomized sequence. Therefore, the maximum information content that can be obtained is statistically limited by the number of random sequences (i.e., library size). This constitutes a statistical bottleneck, eliminating other sequence families which are not currently best, but which may have greater long term potential.\nFurther"} +{"output_text": "The present invention provides a method and apparatus for forming an end from a sheet of end material in a single press. 
The sheet of end material is fed from a curler to a first press, where it is formed into a first end. The first end is fed to a second press, where it is formed into a second end. The second end is fed to a third press, where it is formed into a third end. The third end is fed to a fourth press, where it is formed", "input_text": " is required. The system requires two balancers and a total of four presses, all of which are very costly machines. Further, the layout of the equipment requires a large amount of space and therefore a large building site, which increases the cost. Construction of such a building and providing heating and cooling also increases costs. Furthermore, the sheer number of presses and the balancers results in a system that consumes a lot of electrical power during operation, and a large number or operators and mechanics, increasing the costs further. While the system of FIG. 1 could be modified somewhat by reducing the number of conversion presses or eliminating the dryers, the basic architecture of the prior art system based on shell presses, conversion presses, balancers and extensive track-work is a very capital, space, energy and labor intensive system.\nThe present invention provides a substantial improvement over the end manufacturing system of the general type shown in FIG. 1. It also presents solves problems inherent in the Buhrke and Herrmann systems. As described below, the present invention provides for the formation of an end from a sheet of end material in a single press, thereby completely eliminating much of the track work, balancers, extra presses, and space and capital requirements of the system of FIG. 1. Furthermore, there is no need for any balancers in the present system since there is only one press and the ends are fed directly from the curlers to the liners and to the bagging station. 
The cost savings to install a new end manufacturing system of the present invention, as compared to a new end making system in accordance with the prior art approach of FIG. 1, with the same capacity, is in the order of many millions of dollars. Further, the end making system of the present invention is particularly suitable to smaller scale implementations, but can be modularized to increase capacity without requiring substantial increases in floor space or capital investment.\n"} +{"output_text": " system for providing multimedia information to a client.\nThe Internet is a worldwide interconnection of computer networks that communicate using a common protocol. Millions of computers, from low end personal computers to high-end super computers are coupled to the Internet. The Internet provides a vast amount of information that is constantly changing and that is available to anyone with an Internet connection.\nThe World Wide Web (the Web) is a collection of formatted hypertext pages located on numerous computers around the world that are logically connected", "input_text": " a first point offset a predetermined amount ahead of a first R wave and a second point offset a predetermined amount behind the first R wave is determined. The highest peak within the first interval is then identified. This peak is the A wave.\nOnce the A wave is identified, a second R wave in the ECG, subsequent in time to the first R wave, is identified. Once the second R wave is determined, a second interval in the atrial pressure waveform that extends from the second point to a third point is established. The third point is positioned at a distance ahead (subsequent in time) of the second point equal to a percentage of the interval from the second point to a predetermined amount behind the second R wave. The highest peak in the second interval is the V wave.\nAfter identifying the V wave, a third interval in the atrial pressure waveform is established. 
The third interval extends from the highest peak in the first interval to a fourth point. The fourth point is positioned at a distance ahead of the highest peak in the first interval (the A wave) equal to a second percentage of the distance between the highest peak in the first interval and the highest peak in the second interval (the V wave). The C wave is the highest peak in the third interval. This same process may be repeated across the different R waves in the ECG to determine the A, C, and V waves for different \u201cheart beats\u201d or cardiac cycles.\nAs is apparent from the above, it is an advantage of the present invention to provide a method of determining or identifying the A, C, and V waves in a waveform. Other features and advantages of the present invention will become apparent by consideration of the detailed description and accompanying drawings. The present invention relates generally to a client and server system for presenting multimedia information and, more particularly, to an integrated internet on-demand"} +{"output_text": " in the image forming apparatus, and a developing device which is mounted in the image forming apparatus.\n2. Description of the Related Art\nIn an image forming apparatus such as a laser printer or a copying machine, a developing device is mounted in an apparatus body. The developing device includes a developing roller which is arranged in the developing device and which is configured to supply developer to a photosensitive drum. The developing roller is configured to be rotated by a driving force of a motor. The developing roller is provided", "input_text": " the eye. A Schirmer's test can measure the amount of moisture bathing the eye. 
This test is useful for determining the severity of the condition.\nA variety of approaches can be taken to treatment, such as: avoidance of exacerbating factors, tear stimulation and supplementation, increasing tear retention, and eyelid cleansing and treatment of eye inflammation.\nFor mild and moderate cases, supplemental lubrication is the most important part of treatment. Application of artificial tears every few hours can provide temporary relief.\nLubricating tear ointments can be used during the day, but they generally are used at bedtime due to poor vision after application. They contain white petrolatum, mineral oil, and similar lubricants. They serve as a lubricant and an emollient. Depending on the severity of the condition, ointments may be applied from every hour to just at bedtime. Ointments should not be used with contact lenses. Inflammation occurring in response to tears film hypertonicity can be suppressed by mild topical steroids or with topical immunosuppressants such as cyclosporine.\nTopical 0.05% cyclosporine A, as a castor oil-based ophthalmic emulsion, is marketed in the United States by Allergan under the trade mark RESTASIS\u00ae. RESTASIS\u00ae decreases surface inflammation of the eye. It is thought to work through inhibition of transcription factors required for cytokine production and T-lymphocyte maturation. In a trial involving 1200 people, RESTASIS\u00ae increased tear production in 15% of people, compared to 5% with placebo. Usually, 1 drop of RESTASIS\u00ae is instilled in each eye twice a day, 12 hours apart. 1. Field of the Invention\nThe present invention relates to a process cartridge which is mounted in an image forming apparatus such as a laser printer or a copying machine, an end member which is arranged"} +{"output_text": ". No. 4,210,972; U.S. Pat. No. 4,218,742; U.S. Pat. No. 4,224,948; U.S. Pat. No. 4,226,848; U.S. Pat. No. 4,230,685; U.S. Pat. No. 4,236,525; U.S. Pat. No. 4,238,812; U.S", "input_text": " Pat. No. 
3,606,402; U.S. Pat. No. 3,692,601; U.S. Pat. No. 3,700,519; U.S. Pat. No. 3,701,489; U.S. Pat. No. 3,734,421; U.S. Pat. No. 3,738,637; U.S. Pat. No. 3,740,285; U.S. Pat. No. 3,769,127; U.S. Pat. No. 3,783,060; U.S. Pat. No. 3,828,112; U.S. Pat. No. 3,856,052; U.S. Pat. No. 3,856,052; U.S. Pat. No. 3,860,742; U.S. Pat. No. 3,933,180; U.S. Pat. No. 3,956,051; U.S. Pat. No. 3,957,410; U.S. Pat. No. 3,960,629; U.S. RE29,122; U.S. Pat. No. 4,053,343; U.S. Pat. No. 4,057,610; U.S. Pat. No. 4,095,865; U.S. Pat. No. 4,108,701; U.S. Pat. No. 4,125,423; U.S. Pat. No. 4,133,972; U.S. Pat. No. 4,137,949; U.S. Pat. No. 4,139,025; U.S. Pat. No. 4,190,088; U.S. Pat"} +{"output_text": " the range of about 9 to about 12.5.\nThe oxygen delignification stage is performed at a temperature of between about 50 and about 100xc2x0 C. and preferably between about 60 and about 90xc2x0 C. The oxygen delignification stage is performed at a pressure of between about 0.1 and about 5 MPa and preferably between about 0.5 and about 3 MPa.\nThe oxygen delignification stage is performed in a reactor vessel equipped with", "input_text": "/ton of material and more preferably between 50 and 300 kWh/ton.\nc) Oxygen Deligniflcation\nOxygen delignification and bleaching with oxygen-based molecules have become increasingly popular in conjunction with the manufacturing of kraft pulp and the cost of oxygen chemicals has come down significantly. The oxygen delignification stage of the present invention, following the pretreatment, is performed in one or preferably two or more stages.\nIn analogy with the precooking step discussed above, an alkaline buffer is also present during oxygen delignification. The alkaline buffer agent may contain alkali metal carbonate or bicarbonate. Other buffering agents can be employed such as alkali metal phosphates and alkali metal boron compounds. 
The most preferred buffer solution comprises sodium carbonate, sodium bicarbonate or sodium borates or mixtures of these compounds. The alkaline buffer solution originates in the chemicals recovery system of the present invention from where it is recycled for use in the oxygen delignification stage without having been subjected to causticizing reactions with lime.\nThe alkaline buffer can be supplied to the oxygen delignification stage as such, but it is also possible to add alkali metal hydroxides to increase the alkalinity of the buffer solution. When carbonate or bicarbonate is used as a buffer component, carbon dioxide may be liberated during oxygen delignification and gases may have to be vented from the reactor vessel continuously or from time to time. A high partial pressure of carbon dioxide retards the delignification, and uncontrolled variations in the carbon dioxide content of the pulping liquor make control of the oxygen delignification process difficult.\nWhether alkali bicarbonate, carbonate, or borates, or a mixture thereof is used, it is suitable to add the alkaline buffer solution incrementally during oxygen delignification. Ultimately, the addition is controlled to maintain the pH within"} +{"output_text": ". The shroud is preferably made of a non-magnetic material, such as stainless steel, and the electromagnet is preferably made of a magnetic material, such as iron. The shroud may be made of a non-magnetic material, such as stainless steel, and the electromagnet may be made of a magnetic material, such as iron. 
The shroud may be made of a non-magnetic material, such as stainless steel, and the electromagnet may be made of a magnetic material, such", "input_text": " finally, prior art sensors, make relatively inefficient use of the magnetic field and generally require substantially more electrical operating power than do other types of probe sensors.\nIt is therefore an object to provide a magnetic flow sensing probe which offers significant improvement over the prior art just described.\nIt is a further object to adapt the improvements of the probe sensor configuration to the inline sensor configuration.\nThe above and other objects of the invention are attained by magnetic flow sensors in accordance with various preferred embodiments of the present invention. In preferred embodiments, the magnetic axis (i.e., the line extending from the south to the north pole) of an electromagnet is oriented generally perpendicular to a direction of flow of a conductive liquid. As is known in the magnetic flow metering art, the flux from a magnet arranged in this fashion generates in the liquid a voltage difference proportional to the flow rate of the liquid. In various embodiments of this invention this voltage difference is sensed by a sensing head comprising, in addition to the magnet, a pair of electrodes (which preferably have the same size and shape and are made of the same material) spaced apart from each other along a separation line that is generally orthogonal to both a direction of flow and the magnetic axis. These electrodes may be located in a circular shroud.\nThe voltage indicative of flow rate is measured between the corresponding two electrodes of a pair when the associated magnetic flux is present and stable, as is known in the magnetic flow metering art. 
This may consist of a cyclic processing procedure including the measurement and storage of first a first electrode difference potential when no magnetic field is present, followed by a similar measurement with the field present. In this arrangement, the difference between the two measurements is representative of the liquid flow rate.\nIn a preferred embodiment of the present invention the flow passage is defined by a cylindrical shroud and the electromagnet is located close to the region of fluid flow being sensed"} +{"output_text": ". Neurol. 55:1033-1039 [1998]; and Borasio et al., Ann. Neurol. 46:8-15 [1999]).\nThe present invention provides a novel method for treating a disease or condition associated with the presence of a mutant or aberrant protein. The method comprises administering to a subject in need thereof an effective amount of a compound of the formula: \nwherein:\nR1 is selected from the group consisting of hydrogen, alkyl, alkenyl", "input_text": " disorder resulting from a CAG/polyglutamine repeat expansion in the gene encoding this disease, ultimately resulting in the death of striatal neurons. The polyglutamine expansion results in the formation of insoluble, high molecular weight protein aggregates similar to those seen in Alzheimer's disease (Scherzinger et al., Cell 90:549-558 [1997]). Postmortem examination of the brains of patients suffering from Huntington's disease revealed that CAG repeat length positively correlates with the degree of DNA fragmentation within the afflicted striatum (Butterworth et al., Neurosci., 87:49-53 [1998]), indicating that neuronal degeneration observed in Huntington's disease may also occur through an apoptotic process.\nAmyotrophic lateral sclerosis (ALS) is caused by a progressive degeneration of spinal cord motor neurons and results in complete paralysis, respiratory depression and death. Aggregates of ubiquitinated proteins have been observed in ALS (Kato et al., Histol. Histopathol., 14:973-989 [1999]). 
Recent experiments suggest that death of motor neurons in ALS may have an apoptotic component (Pasinelli et al., Proc. Natl. Acad. Sci. USA 95:15763-15768 [1998]; and Martin, J. Neuropathol. Exp. Neurol., 58:459-471 [1999]).\nCurrently used therapies for Alzheimer's disease, Huntington's disease and amyotrophic lateral sclerosis suffer the same limitations associated with Parkinson's disease therapies described above (See e.g., Sramek et al., Drugs & Aging 14:359-373 [1999]; Mayeux and Sano, N. Eng. J. Med., 341:1670-1679 [1999]; Eisen and Weber, Drugs & Aging, 14:173-196 [1999]; Borasio et al., Neurology 51:583-586 [1998]; Riviere et al., Arch"} +{"output_text": " the switch 175 to provide the Master Timing Reference (MTR) signal. The MTR signal is fed to a payload clock distribution unit (PCDU) 160. The PCDU 160 is adapted to distribute the MTR signal to up to four identical outputs 162, 164, 166, 168. The output signal of the second synthesizer 145 provides a hot redundant alternative for the MTR signal.\nThe master clock generation unit 100 is a complex and expensive unit. The switching matrix 130 is a", "input_text": " synthesizer outputs is selected to provide a master clock for a payload and is distributed to up to four identical outputs. The output signal of the second synthesizer provides a hot redundant alternative for the master clock. A phase meter monitors the output phase of the active synthesizer against the hot redundant one.\nAccordingly, the clock monitoring and control unit being part of the payload for satellite navigation systems generates a satellite's Master Timing Reference (MTR) signal based on input signals provided by atomic standards. The functional concept of a known master clock generation unit is shown in FIG. 1. 
The master clock generation unit or CMCU 100 derives an output reference frequency, namely a 10.23 MHz on-board Master Timing Reference (MTR), based on a set of four atomic frequency standards which are fed to frequency inputs 102, 104, 106 and 108. Each of the frequency inputs 102, 104, 106, 108 is connected to a respective matrix input 131, 132, 133, 134 of a 4\u00d72 switching matrix 130. The switching matrix 130 enables selecting a nominal (primary) and a redundant (secondary) clock at a first and a second matrix output 135, 136. The switching matrix 130 is telecommanded via a controller 180. The nominal and the redundant clock at the first and the second matrix output 135, 136 are fed to a first and a second frequency synthesizer 140, 145. The frequency synthesizers 140, 145 are adapted to perform a frequency conversion according to different clock types: a Passive Hydrogen Maser (PHM) and a Rubidium Clock (RAFS). The respective synthesizer outputs 142, 147 are connected to a phase meter 170 and a switch 175. The phase meter 170 monitors the phase difference between the output signals of the frequency synthesizers 140, 145 and stores the results for later retrieval. One of the two synthesizers 140, 145 output signals is selected by"} +{"output_text": "ligomer content is calculated by subtracting the theoretical acid number from the acid number determined by chemical analysis.\nThe carboxyl-containing monomers can be made by reacting a carboxyl-containing monomer with a hydroxyl-containing monomer in the presence of a free radical initiator. The carboxyl-containing monomer can be any carboxyl-containing monomer which is capable of reacting with the hydroxyl-containing monomer to form a prepolymer. The carboxyl-containing monomer can be any carboxyl-containing monomer which", "input_text": " preferably from about 3,000 to about 50,000 cps, and most preferably from about 3,000 to about 20,000 cps. 
If viscosity of prepolymer is more than 100,000 cps, the prepolymer usually is too thick for high speed mixing and no good waterborne dispersion can be obtained.\nIn order to increase the shelf life of prepolymer products made from the carboxyl-containing monomers, it is desirable that the carboxyl-containing monomers made as described above contain minimal amounts of oligomers. As defined herein, oligomers are molecules which result from the reaction of the grafted carboxyl function with another hydroxyl function, which can lead to oligomerization of the monomer products. Oligomers are undesirable due to their propensity to cause increased viscosity of the monomer product.\nIt has been found that the presence of oligomers above about 30 mg KOH/g (as analyzed below) results in undesirable gelling of the prepolymer product. Preferably, the carboxyl-containing monomers have less than 30 mg KOH/g oligomers, preferably between 2 and 30 mg KOH/g oligomers, more preferably between 2 and 20 mg KOH/g oligomers, and most preferably between about 2 and 15 mg KOH/g oligomers. Oligomer content in the carboxyl-containing monomer can be measured by calculating the difference between theoretical acid number and acid number determined by chemical analysis as known in the art.\nBriefly, acid number is determined using 1-2 grams of sample. 100 ml of isopropyl alcohol and 50 ml water is added to the sample, and stirred until the sample is completely dissolved. Approximately 15 drops of 1% phenolphtalein solution is added, and the sample solution is titrated with 0.5 N potassium hydroxide (or 0.5 N sodium hydroxide) until a light pink color appears. O"} +{"output_text": ", and the InGaAsP light absorption layer 177. 
The electrons and holes generated in the InGaAs light absorption layer 177 are accelerated by the electric field reduction layer 175 and the electric field reduction layer 175, and then are collected by the anode electrode 171.\nThe multiplication region 181 is provided to increase the multiplication gain of the avalanche photodiode (APD). The multiplication region 181 is a p-type region having a high carrier concentration. The multiplication region 181 is provided to increase", "input_text": " wavelength light 162 is provided in front of the avalanche photodiode 161, acting as a photodetector device, to selectively receive 1.3 \u03bcm wavelength light 164.\nFIG. 12 is a cross-sectional view of a conventional avalanche photodiode (hereinafter referred to as a \u201cconventional APD\u201d) for optical communications. Referring to the figure, reference numeral 171 denotes an anode electrode; 172 denotes a p-type diffusion layer region; 173, a nonreflective film; 174, an undoped InP window layer; 175, an n-type InP electric field reduction layer; 176, an undoped InGaAsP graded layer; 177, an undoped InGaAs light absorption layer; 178, an n-type InP substrate; 179, a cathode electrode; 180, an anode electrode; 181, a multiplication region; and 182, a guard ring region.\nThe nonreflective film 173 and the InP window layer 174 also act as a surface protective film and a multiplication layer, respectively. It should be noted that the InP window layer 174 has a large bandgap and hence does not absorb the wavelengths used in typical optical communications, such as 1.3 \u03bcm and 1.55 \u03bcm, allowing these wavelengths to pass without change. The guard ring region 181 is provided to prevent edge multiplication and is a p-type region having a low carrier concentration.\nLight entering the nonreflective film 173, as shown at the top of the FIG. 
12, is passed through the InP window layer 174 and then absorbed by the InGaAs light absorption layer 177, generating electrons and holes. It should be noted that the avalanche photodiode (APD) is reverse-biased with a high voltage (approximately 25 V), which depletes the InGaAs light absorption layer 177, the InGaAsP graded layer 176"} +{"output_text": " by the bar code production apparatus, and discount money amount inputting means for inputting a discount amount of money for the commodity whose bar code has been read by the POS bar code reading means, and settlement processing means for performing settlement processing based on the read information of the bar code discriminated by the bar code discrimination means and the discount money amount inputted by the discount money amount inputting means.\nAccording to the present invention, there is also provided a POS system, comprising a bar code production apparatus", "input_text": "-discount data read by means of a scanner is applied to the commodity, the pre-discount data is not known to the POS system. Consequently, there is a problem that the customer cannot know by what amount of money the commodity is discounted.\nAccording to the fourth form, various information can be written using a multi-dimensional bar code. However, it is not popular to use a multi-dimensional bar code with a POS system, and it is necessary to construct unique hardware. 
Thus, the system of the fourth form is low in universal use.\nIt is an object of the present invention to provide a POS system which does not make an error in discounting with a simple construction and is high in credit of a customer.\nIn order to attain the object described above, according to the present invention, there is provided a POS system, comprising a bar code production apparatus for producing a bar code which includes discount information of a commodity, and a POS apparatus for reading information of the bar code produced by the bar code production apparatus and performing settlement processing based on the read information, the bar code production apparatus including bar code reading means for reading a bar code of a commodity before the commodity is discounted, discount money amount inputting means for inputting a discount amount of money for the commodity whose bar code has been read by the bar code reading means, bar code production means for producing a bar code which includes the pre-discount commodity information and the discount money amount information based on the bar code read by the bar code reading means and the discount money amount inputted by the discount money amount inputting means, and bar code outputting means for outputting the bar code produced by the bar code production means, the POS apparatus including POS bar code reading means for reading a bar code, bar code discrimination means for discriminating whether or not the bar code read by the POS bar code reading means has been produced"} +{"output_text": " surface of a substrate, a sensor 4 installed at a lower side of the bubble plate 3 to sense the supply flow of the etchant, and a controller 5 controlling the supply flow of the etchant according to the sensing result of the sensor 4.\nThe etching apparatus according to the related art is explained in detail as follows.\nThe etching bath 1 is filled with the etchant. 
The bubble plate 3 is installed at a lower side of the etching bath 1 to generate bubbles by a gas or", "input_text": " glass substrate should be decreased. Yet, a physical force is occasionally applied to the glass substrate in the process of fabricating the liquid crystal display device. And, the glass substrate undergoes a number of heating and cooling processes. Hence, the thin glass becomes easy to be broken. Recently, used is a new method including the steps of using a thick glass substrate in the early stage of process and thinning the glass substrate in the later process. Namely, devices and color filters are formed on thick glass substrates to prepare upper and lower glass substrates, the upper and lower glass substrates are bonded to each other, and then outer surfaces of the glass substrates are etched to reduce an overall thickness of the liquid crystal display device.\nGenerally, the glass substrate is etched by wet etching carried out in a manner that the glass substrate is dipped in a bath filled with an etchant of strong acid etching a surface of the glass substrate.\nHowever, such a method of wet etching makes the uneven surface of the substrate since particles generated from the etching process sticks to the substrate. Moreover, if the supply of the etchant fails to be controlled, the glass substrate is etched in part to generate the failure caused by the difference between the etched and non-etched portions.\nIn order to overcome such problems, a supply flow of an etchant is controlled using an etching apparatus equipped with a sensor enabling to sense the supply flow of the etchant.\nAn etching apparatus according to a related art is explained by referring to the attached drawings as follows.\nFIG. 1 illustrates a schematic cross-sectional view of an etching apparatus according to a related art.\nReferring to FIG. 
1, an etching apparatus according to a related art includes an etching bath 1 having an etchant, a bubble plate 3 installed at a lower side inside the etching bath 1 to generate bubbles by a gas or an air supplied from outside to remove particles on a"} +{"output_text": " to use, and provides immediate feedback regarding the golf club stroke. The present invention fulfills these needs and provides further related advantages as described in the following summary.\nThe present invention provides a golf tee that provides immediate feedback regarding the golf club stroke. The golf tee includes a shaft having a first end and a second end, a handle secured to the first end of the shaft, and a ball support secured to the second end of the shaft. The ball support includes a ball support body having a first", "input_text": " housing and spaced apart so that the wear rate of the housing in the misch metal barrels is compatible and a desired sparking effect is achieved. The spark-emitting device for a skateboard is not utilized for golf purposes; no disclosure is contained therein regarding a spark-emitting tee for golf play.\nU.S. Patent Application Publication No. 20090143159 to Murph et al. discloses a golf club that provides a universal training tool for golfers of all sizes. The golf club includes an adjustable length shaft having a club head secured at one end thereof and a handle secured at the other end thereof. A sensor circuit disposed in the club head includes a first sensor adapted to generate and transmit a first measurement signal representing a first desired characteristic of the golf club, and a display circuit disposed in the handle.\nU.S. Patent Application Publication No. 20130165273 to Delisle et al. discloses a golf tee including an elongate shaft having opposed upper and lower ends, the lower end configured to be inserted into an underlying surface; and a support cup that is configured to support a golf ball from beneath and that merges with the shaft. 
The support cup has a base portion and further includes at least three arcuate support prongs projecting upwardly from the base portion. The support prongs define a discontinuous annulus about the periphery of the support cup. There is no disclosure of a spark induction coating on the golf tee. Inasmuch as no spark is generated, the golf tee does not provide immediate feedback regarding the golf club stroke.\nForeign Patent Publication No. WO/2011/078469 to Ru discloses a golf tee that prevents the golfer from raising his head up. The golf tee construction comprising inter alia a light emitting lamp built in a laid portion.\nNone of the heretofore disclosed and/or utilized devices or methods provides a training aid that is economical to produce, easy"} +{"output_text": " warning devices are typically used to warn of the presence of a mobile station in a particular location. For example, a warning device may be used to warn of the presence of a mobile station in a hospital or other sensitive area. However, such warning devices are not always effective. For example, a mobile station may be located in a location where the mobile station is not transmitting RF signals. As a result, the warning device may not detect the presence of the mobile station.\nIn addition to the above", "input_text": " parameters, such as variations in A-A, R-R or AV intervals, may be additionally used to aid in tracking respiration but are not required.\nThe parent application also presented techniques for detecting episodes of abnormal respiration based on respiration patterns, such as episodes of such as apnea, hypopnea, nocturnal asthma, or CSR. The present application is primarily directed to providing further improvements in the area of abnormal respiration detection. This invention relates to mobile station communications and, more particularly to location dependent behavior in a mobile station.\nAn increasingly common problem faced by mobile station users is that of prohibition of use. 
Reasons for such prohibitions vary, generally according to location. For example, operation of an electronic device, such as a mobile station, during the take-off and landing operations of an aircraft can interfere with electronic signals which are critical to the operation of the aircraft. As a result, the FAA prohibits the operation of certain electronic devices by passengers during take-off and landing operations. However, this prohibition does not ensure that deliberate violators and careless passengers will not operate their electronic devices during such critical periods. A means of addressing this specific hazard is disclosed in U.S. Pat. No. 5,815,407, entitled \u201cMethod and Apparatus for Inhibiting the Operation of an Electronic Device During Take-Offs and Landings of an Aircraft.\u201d Prohibitions on mobile station use due to critical operations can also occur in other environments. Such environments can include, for example, hospitals and other areas where sensitive medical instruments may need to be protected from possible radio frequency interference (RFI) caused by mobile stations.\nIn addition to prohibitions on the use of mobile stations and other electronic devices, warning devices which detect the radio frequency (RF) transmission of a mobile station may also be used. 
Such"} +{"output_text": " two cycles and a second refresh release command executed during a second cycle of the at least two cycles.\nAccording to another aspect of the embodiments, the first refresh release command may be executed in response to a first clock enable signal and the second refresh release command may be executed in response to a second clock enable signal.\nAccording to another aspect of the embodiments, the first clock enable signal may be enabled during a first cycle of the at least two cycles and the second clock enable signal may be enabled during", "input_text": "i-10-125059, the setting and exiting (releasing) of the self refresh mode both executed with clock enable signal CKE in synchronism with external clock signal ECK. Accordingly, it is necessary to keep the initial stage circuit that generates internal clock ICK enabled during the operation of the self refresh mode. Thus, power consumption during the operation of the self refresh mode may be problematic.\nIn light of the above discussion, it would be desirable to provide a semiconductor memory device that may include a self refresh mode that may not be erroneously exited (released) due to the influence of noise or the like. It would also be desirable to provide the self refresh mode having a reduced power consumption.\nA semiconductor memory device having a self refresh mode is disclosed. The self refresh mode may be entered in response to a self refresh set command and may be released in response to a self refresh release command. The self refresh release command may include a plurality of self refresh release commands sequentially executed while a clock enable signal is in a clock enable state. 
In this way, noise on a clock enable signal may not erroneously release the self refresh mode and the reliability of the self refresh mode may be improved.\nAccording to one aspect of the embodiments, a semiconductor memory device may include a refresh operation entered upon the receipt of a refresh set command. The refresh operation may be released upon the receipt of a refresh release command wherein the release command may be executed over at least two cycles.\nAccording to another aspect of the embodiments, the semiconductor memory device may be a synchronous dynamic random access memory and the at least two cycles may be two cycles of an external clock.\nAccording to another aspect of the embodiments, the at least two cycles may be consecutive cycles of the external clock.\nAccording to another aspect of the embodiments, the refresh release command includes a first refresh release command executed during a first cycle of the at least"} +{"output_text": "ization signals from the control console. For example, a femoral reamer handpiece is provided with a foot switch that is used to control the speed of the motor. The foot switch is connected to the control console by a cable that is terminated at the handpiece. The handpiece is provided with a connector that is adapted to receive the cable. The connector is provided with a set of pins that are arranged to mate with the connector on the console. The connector on the console is provided with a set", "input_text": " burr or a saw blade are attached. These tools are used to selectively remove small sections of hard or soft tissue or to separate sections of tissue. The ability to use powered surgical tools on a patient has lessened the physical strain of physicians and other personnel when performing surgical procedures on a patient. 
Moreover, most surgical procedures can be performed more quickly and more accurately with powered surgical tools than with the manual equivalents that proceeded them.\nA typical powered surgical tool system, in addition to the handpiece, includes a control console and a cable that connects the handpiece to the console. The control console contains the electronic circuitry that converts the available line voltage into energization voltage suitable for powering the motor integral with the handpiece. Typically, the control console is connected to receive a signal from the hand or foot switch used to control the tool; based on that signal, the console sends appropriate energization signals to the handpiece so as to cause it to operate at the desired speed.\nAs the use of powered surgical tools has expanded, so has the development of different kinds of powered surgical tools that perform different surgical tasks. For example, a femoral reamer, used in hip replacement surgery is a relatively slow speed drill that operates at approximately 100 RPM, yet it draws a relatively high amount of power, approximately 400 Watts. Neurosurgery requires the use of a craniotome which is a very high powered drill that operates at approximately 75,000 RPM and that requires a medium amount of power, approximately 150 Watts. In ear, nose and throat surgery, micro drills are often employed. A typical micro drill rotates between approximately 10,000 and 40,000 RPM and requires only a relatively small amount of power, approximately 40 Watts.\nAs the number of different types of powered surgical tools have expanded, it has become necessary to provide each type of handpiece a mechanism for ensuring that it receives the appropriate energ"} +{"output_text": ". 
Therefore, an arrangement is needed for a biological safety cabinet that eliminates the need for a perforated plate and instead provides for a uniform air flow across the supply filter.\nYet another drawback of existing prior art safety cabinets involves the design of the exhaust system. The exhaust system is located in the area above the blower and below the work area. More specifically, in prior art cabinets, air leaving the blower is directed to a perforated plate and then through a supply filter prior to being rec", "input_text": " cleaning it. Therefore, a towel catch that is readily accessible and can be visually inspected is needed.\nAnother drawback of prior art safety cabinets involves the construction of the sash. The sash of the safety cabinet is moveable upwardly and downwardly, to allow better access to the working environment when needed and to more fully enclose the working environment when access is no longer needed. In prior art safety cabinets, the rear of the sash is provided with a seal to prevent any contaminated air from escaping the working environment. The seal wipes the back of the sash as the sash is raised. This arrangement is disadvantageous in that the wiping action may create an aerosol containing contaminants from the rear of the sash. While in other prior art constructions holes communicating with the exhaust system have been utilized in place of seals, such constructions have not been particularly effective, largely because there has been no means for insuring a uniform negative pressure across the exhaust holes. Thus, an arrangement is needed for a biological safety cabinet that eliminates the need for a wiping seal at the rear of the sash and instead provides for a uniform negative pressure which will insure removal of any contaminated air from the back side of the sash.\nYet another drawback of existing prior art safety cabinets involves the design of the positive pressure plenum box. 
This box is located in the area below the blower and above the work area. More specifically, in prior art cabinets, air leaving the blower is directed to a perforated plate and then through a supply filter prior to be recirculated downwardly through the work area. The perforated plate is used to more evenly distribute the air flow over and through the supply filter. The perforated plate creates an undesirable increased load on the blower and can interfere with the function of the supply filter. Moreover, this prior art construction does not distribute air across the supply filter as evenly as desired"} +{"output_text": " \u201csoftware pirates\u201d who are not interested in the software they copy, but only in the money they can make from it.\nCopy-encouragement\nThe shareware and small-scale marketers who tolerate low registration rates in order to reach the many potential users who can be reached at little cost through non-traditional distribution channels are not interested in the software they copy, but only in the money they can make from it. They are not interested in the software itself, but only in the", "input_text": " apparatus, such as a computer or a digital audio tape player.\nThe Copyability of Software\u2014Problem and Opportunity\nDigitally encoded information (\u201csoftware\u201d) is one of the most economically important commodities of the era. The ease and economy with which perfect copies can be made, copied and distributed has promoted the spread of software and related technologies through \u201ctraditional\u201d commercial channels (retail and mail order sales, etc.) and through \u201cnon-traditional\u201d distribution channels: computer user groups, user-to-user copying and sharing (e.g., of software and of music and video tapes), digital data networks such as the internet, Compuserve, static media such as CD-ROM disks loaded with large quantities of data, public libraries, and broadcast media. 
These non-traditional distribution channels in particular have made it difficult for software creators and copyright holders to regulate the use of their creations, or to receive payment and registration information from their users. Consequently, software producers forfeit substantial revenues and valuable information about their customer base and potential markets, while businesses and universities find themselves subject to legal prosecution and intimidation for software piracy.\nTwo approaches to these problems are copy-deterrence, and copy-encouragement. Copy-deterrence is implemented through laws, license agreements and copy-protection technologies. Copy-encouragement is practiced by \u201cshareware\u201d and small scale marketers who tolerate the low registration rates in order to reach the many potential users who can be reached at little cost through non-traditional distribution channels. Separately and in combination, however, these approaches have had significant disadvantages.\nCopy-deterrence\nLegal copy-deterrence techniques such as licensing agreements, and litigation against companies and universities whose members knowingly or unknowingly engage in piracy are inefficient, expensive, and often unsuccessful. They incidentally create large numbers of \u201csoftware criminals\u201d or"} +{"output_text": " quantity. For example, a photocopier may be provided with a photocopier cartridge that includes a photocopier imaging media quantity detector. The photocopier cartridge may include a photocopier imaging media quantity detector that is configured to detect the quantity of imaging media in the cartridge. The photocopier cartridge may also include a photocopier imaging media quantity indicator that is configured to indicate to the user when the cartridge is low on imaging media. The photocopier imaging media quantity detector may", "input_text": ", and removed from, the imaging apparatus. 
The cartridge is typically designed to prevent leakage of the imaging media from the cartridge when the cartridge is handled by a user or installed in the device, but is also designed to allow the imaging apparatus to selectively remove the imaging media from the cartridge during an imaging process.\nBy \u201cimaging apparatus\u201d we mean any apparatus configured to use imaging media to generate an image on sheet media, such as on paper or a transparency. Examples of imaging apparatus include (without limitation) printers, photocopies, facsimile machines, plotters, and combinations thereof (i.e., imaging apparatus commonly known as \u201call-in-one\u201d imaging apparatus or \u201cmultifunction peripherals\u201d). Example of imaging processes that can be used by imaging apparatus include electrophotographic imaging, including laser printing, and ink printing, including ink jet printing. Two primary types of imaging media are provided to imaging apparatus via a cartridge. These primary types of imaging media include wet ink and dry toner. Dry toner (\u201ctoner\u201d) is commonly provided as powdered carbon black or very small particles of plastic (as in the case of non-black toners).\nWhen the imaging media within a cartridge becomes depleted, the user typically replaces the spent cartridge with a replacement cartridge that contains additional imaging media. The user may not always have a replacement cartridge on hand, or the replacement cartridge may not be easily accessible. Accordingly, a user may be put in the position of not being able to complete an imaging job due to a lack of imaging media.\nSome imaging apparatus are provided with imaging media quantity detectors which allow a user to have advance notice of a low imaging media"} +{"output_text": " displays have a third resolution limiter. The phosphor is driven continuously, but the electron beam is scanned in a raster pattern. 
This is because the electron beam is not deflected to the next phosphor location until the beam is back at the starting point. This means that the electron beam must be scanned back and forth across the entire display. This is a very slow process and is the reason that the electron beam is not deflected to the next phosphor location until the beam is back at the starting point", "input_text": " drive from the electron beam. As the drive increases, so does the brightness. Unfortunately, the shadow mask is also sensitive to the electron beam and will thermally distort under high drive. The image is then blurred both by the shadow mask becoming more visible and by the electron beam being deflected toward and unwanted phosphor.\nThe second resolution limiter is rastering. All pixels to be illuminated are sequentially scanned by an electrom beam. This beam is swept in a raster back and forth acros the phophors. In general, the beam is turned off when tracing back across the phosphors (known as the retrace time) and is also turned off when returning to the starting point (vertical blanking interval). While this is not a theoretical limitation (all phosphor points can be accessed), it is a practical limitation. This is because the fluorescence of the phosphors begin decaying as soon as the electron beam moves to the next location. The electron beam must return before the human eye can perceive the decay or else the display will flicker. Longer persistence phosphors can be used to compensate, but they suffer from a smear effect when the display data changes.\nRastering has another insidious side-effect. It places an upper limit on the perceived brightness of a display. As discussed above, a phosphor can only be driven for a very short period of time, and will then start to decay. If the phosphor is driven hard, then it will start to bloom (i.e. it will start to excite neighboring pixel locations) and blur the display. 
If the phosphor was continously excited for an extended time, it would appear to be brighter than it if was excited only for the raster period. This is because the human eye has an integration time of approximately 0.1 seconds for bright sources of light and approximately 0.2 seconds for dimmer sources.\nProjection CRT based"} +{"output_text": " a photomask for a light exposure process, and the light-semitransmissive film is formed on a substrate having a light-transmissive portion. The light-transmissive portion is formed with a light-transmissive film having a light-transmissive portion and a light-semitransmissive portion. The light-transmissive portion is formed with a light-transmissive film having a light-transmissive portion and a light-semitransmissive portion. The light-trans", "input_text": " of the light-shielding film is reduced, the OD (optical density) value is also reduced. In the case of a chromium-based light-shielding film, the total thickness of about 60 nm is minimally required for achieving OD=3 which is generally required, and therefore, a large reduction in the thickness of the film is difficult to achieve (see, e.g. JP-A-2007-241136 (Patent Document 1), paragraph [0005]).\nJP-A-2009-230112 (Patent Document 2) discloses a binary mask blank comprising a light-shielding film having a laminated structure of tantalum-based materials, such as a light-shielding film having a laminated structure of a TaN layer and a TaO layer from the substrate side. Since a tantalum-based material has a higher light-shielding performance than that of a chromium-based material, even if the total thickness of the film is less than 60 nm, it is possible to achieve OD=3 which is generally required.\nOn the other hand, WO2005/124454 (Patent Document 3) discloses a mask blank comprising a light-semitransmissive film. 
This light-semitransmissive film has a property of transmitting exposure light at a predetermined transmittance and this property is substantially the same as that of a conventional halftone phase shift film. However, this light-semitransmissive film also has a property such that the phase difference between exposure light transmitted through a light-semitransmissive portion formed with the light-semitransmissive film and exposure light transmitted through a light-transmissive portion formed with no light-semitransmissive film is small. This property is totally different from that of the conventional halftone phase shift film. The mask blank comprising this light-semitransmissive film is used for"} +{"output_text": " Y and Z gradient coils) into two or more sections and combining the sections into a single unit. The RF coil is then combined with the main gradient coil unit.\nIn a typical MRI system, the gradient coil assembly is cooled by a cryocooler. The cryocooler is typically a closed-cycle cryocooler that includes a compressor, a condenser, an expansion valve and an evaporator. The compressor is typically a scroll compressor that includes a stationary scroll and a movable", "input_text": " coil assembly is disposed around the RF body coil assembly in a spaced-apart coaxial relationship and the gradient coil assembly circumferentially surrounds the RF body coil assembly. The gradient coil assembly is mounted inside the superconducting magnet and circumferentially surrounded by the superconducting magnet. 
Interconnections for supply and return of electricity, control signals, coolant and the like are typically routed from a \u201cservice end\u201d of the MRI scanner around the cylindrical magnet assembly, while a patient table and other patient-directed aspects are placed at another end, the \u201cpatient end,\u201d of the MRI scanner.\nThe gradient coil assembly used in an MRI system may be a shielded gradient coil assembly that consists of inner and outer gradient coil assemblies bonded together with a material such as epoxy resin. The inner gradient coil assembly or winding and the outer gradient coil assembly or winding are disposed in concentric arrangement with respect to a common axis. Typically, the inner gradient coil assembly includes inner (or main) X-, Y- and Z-gradient coils and the outer gradient coil assembly includes the respective outer (or shielding) X-, Y- and Z-gradient coils. In order to improve gradient coil performance as well as to reduce the radial space used in the magnet assembly, combined (or integrated) gradient coil/RF coil designs have been developed (for example, as described in U.S. Pat. No. 6,930,482, entitled \u201cTime-Variable Magnetic Fields Generator For A Magnetic Resonance Apparatus,\u201d issued on Aug. 16, 2005, naming Oliver Heid and Markus Vester as inventors). Such designs allow the main gradient coils to be brought closer radially to the imaging region, which can improve gradient performance. In an integrated gradient coil/RF coil configuration, a main (or inner) gradient coil assembly and RF coil are combined into a single unit by splitting the main gradient coil (i.e., the X,"} +{"output_text": " and fluidity.\nHowever, the nonmagnetic one-component polymerization and pulverization-type toner has a problem in that the CCA is not uniformly dispersed in the binder resin, and thus the CCA is not uniformly distributed in the toner particles. 
As a result, the CCA is not uniformly distributed in the toner particles, and thus the CCA is not uniformly distributed in the toner particles. As a result, the CCA is not uniformly distributed in the toner particles, and thus the CCA is not uniformly", "input_text": ") and Q/M (\u03bcC/g) of the toner 30 are regulated by the toner layer regulation unit 20. Here, M/A (mg/cm2) is the weight of the toner 30 per unit area measured on the developing roller 16 after going through the toner layer regulation unit 20, and Q/M (\u03bcC/g) is an amount of charge of the toner 30 per unit weight measured on the developing roller 16 after going through the toner layer regulation unit 20.\nAs described above, the toner 30, which is charged with a predetermined charge and in which M/A (mg/cm2) and Q/M (\u03bcC/g) are regulated, moves to the surface of the photosensitive medium 10 using the developing roller 16 that is spaced a predetermined gap apart from the photosensitive medium 10 and rotated. In this case, the movement of the toner 30 is performed by a potential difference between the developing roller 16 and the electrostatic latent image formed on the surface of the photosensitive medium 10. The toner 30 that moves to the surface of the photosensitive medium 10 is attached to the electrostatic latent image. As such, the electrostatic latent image is developed as a desired image.\nThe image developed on the surface of the photosensitive medium 10 is transferred onto a sheet of paper by a transfer roller (not shown), and then is fused on the sheet of paper by a fusing unit (not shown). 
Toner remaining on the surface of the photosensitive medium 10 after the image is transferred onto the sheet of paper is removed by a cleaning blade 22 and is stored in a waste toner storage space 34.\nNonmagnetic one-component polymerization and pulverization-type toner used in the above-described conventional noncontact-type developing method includes toner particles in which a coloring agent, a charging control agent (CCA), and a release agent are added uniformly into a binder resin to improve chromaticity"} +{"output_text": "4)2), and potassium chloride (KCl). The sulfuric acid acts as a catalyst to promote the deposition of copper. The copper sulfate acts as a source of copper ions, and the potassium chloride acts as a source of potassium ions. The copper ions are reduced to copper metal at the cathode, and the copper metal is deposited on the seed layer. The substrate holder is then removed from the plating solution, and the deposited copper is then removed from the substrate using a chemical mechanical polishing (C", "input_text": " a semiconductor substrate or a glass panel such as one used in flat panel display manufacturing, plasma is often employed. As part of the processing of a substrate for example, the substrate is divided into a plurality of dies, or rectangular areas, each of which will become an integrated circuit. The substrate is then processed in a. series of steps in which materials are selectively removed (etching) and deposited. Control of the transistor gate critical dimension (CD) on the order of a few nanometers is a top priority, as each nanometer deviation from the target gate length may translate directly into the operational speed of these devices.\nAreas of the hardened emulsion are then selectively removed, causing components of the underlying layer to become exposed. The substrate is then placed in a plasma processing chamber on a substrate support structure comprising a mono-polar or bi-polar electrode, called a chuck or pedestal. 
Appropriate etchant source are then flowed into the chamber and struck to form a plasma to etch exposed areas of the substrate.\nCopper (Cu) is commonly used as to interconnect microelectronic circuits on the substrate. However, before bulk Cu deposition, some type of copper sputtering deposition process is generally required to deposit a thin seed layer (about 500 \u00c5 to about 2000 \u00c5). A Cu seed layer generally provides the nucleation sites for the bulk Cu grain and film formation. That is, first a barrier layer may be deposited using a PVD (plasma vapor deposition) process, the Cu seed may then be deposited also using a PVD process, and finally the remaining bulk Cu may be deposited using electrochemical plating (ECP).\nIn general, ECP involves placing the substrate (with a Cu seed) on a plastic substrate holder. A cathode then holds the substrate with a conducting steel ring and immerses it in a plating solution containing sulfuric acid (H2SO4), copper sulfate (Cu(SO"} +{"output_text": " the POS system is preferably constructed such that the price of the commodity is displayed in a manner that the customer can visually observe the price.\nIn the POS system, the price of the commodity is displayed in a manner that the customer can visually observe the price. 
Thus, the POS system is preferably constructed such that the price of the commodity is displayed in a manner that the customer can visually observe the price.\nIn the POS system, the price of the commodity is displayed in a manner that the customer", "input_text": " bar code reading means of the bar code production apparatus reads a bar code of the JAN (Japanese Article Number) standard which is a standard bar code for a commodity prescribed by the Japanese Article Number, and the bar code production means produces a bar code of the CODE 128 standard and the bar code reading means of the POS apparatus is capable of reading the bar code of the CODE 128 standard.\nSince the JAN standard is normally used as an ordinary bar code standard for a commodity in Japan, the bar code reading means of the bar code production apparatus reads a bar code of the JAN standard which is a standard bar code for a commodity in Japan. Since the JAN standard allows handling of data only of 13 figures, in order to use discount information, a bar code standard is preferably used which can handle a greater number of figures than the JAN standard.\nTherefore, the bar code production means of the bar code production apparatus produces a bar code of the CODE 128 standard, and the bar code reading means of the POS apparatus is constructed so as to read the bar code of the CODE 128 standard. Accordingly, the POS apparatus can handle discount information while using the predetermined standard of the CODE 128.\nThus, the POS system is advantageous in that it is high in universal use since it basically uses an ordinary bar code standard for a commodity.\nSince the POS system to which the present invention is applied can handle discount information in this manner, it is desirable to definitely indicate to the customer upon settlement that a discount service is provided. 
Thus, preferably the price outputting means of the POS apparatus displays the pre-discount price and the post-discount price so as to be visually observed by a customer upon settlement for the commodity.\nIn particular, generally a POS system is constructed such that, when settlement processing is performed, the price of an individual commodity can be visually observed by a customer. Thus,"} +{"output_text": " the recommended A1C goal of less than 7% and that the majority of patients with type 2 diabetes were not receiving the recommended treatment. The report also stated that the majority of patients with type 2 diabetes were not receiving the recommended treatment. The report further stated that the majority of patients with type 2 diabetes were not receiving the recommended treatment. The report also stated that the majority of patients with type 2 diabetes were not receiving the recommended treatment. The report further stated that the majority of patients with type 2", "input_text": " linked Sulphonylureas to increased cardiovascular risk. Of further concern have been the contrasting outcomes of the ACCORD study which reported that lowering blood glucose to normal levels was associated with increased mortality, but the ADVANCE study did not report such findings. Such controversies in the results may suggest that treatment strategies for type 2 diabetes are not fully understood.\nThis begs the question if improving glycemia is sufficient to provide clinical merit in the treatment algorithm for diabetes. Currently, several therapeutic strategies include metformin in the management algorithm for type 2 diabetes with mono, di and tri therapy needing to be added to the algorithm. Therapies involving existing pharmaceuticals have limited efficacy or tolerability and show significant side effects. 
Many of the side effects of pharmaceuticals are thought to be associated with nutritional deficiencies caused by medications taken over a period of time ultimately resulting in a cascade of biochemical changes due to drug associated nutrient depletion. Unfortunately, long term treatment with metformin has been reported to cause vitamin B12 deficiencies. Despite the available treatment modalities, the risk of cardiovascular events has increased 2-4 fold in patients diagnosed with type 2 diabetes. As a patient's beta cell function declines, intensified treatment beyond the initial monotherapy regimen is required. The prevalence of obesity is also a concern in these patients and is thought to be a driver of cardiovascular events.\nThe \u201cState of Diabetes in America\u201d report on diabetes management evaluated current management strategies and found that, despite advances in diabetes care, blood sugar levels of millions of Americans were not controlled putting them at risk of diabetes related complications. It is possible that effective combination therapies that consist of pharmaceutical drugs and nutraceutical products may provide a new treatment algorithm that would be beneficial to diabetic patients who do not respond to drug therapy alone.\nA 2005 report from the American Association of Clinical Endocrinologists (AACE) stated that 2 out of 3 Americans with type 2 diabetes did not achieve"} +{"output_text": " of the heat.\nIn order to avoid the deficiency of rotation of the motor drive, a cooling fan is usually employed to cool the fuser. The cooling fan is usually disposed in a position adjacent to the fuser. The cooling fan is usually disposed in a position adjacent to the optical unit. The cooling fan is usually disposed in a position adjacent to the optical unit because the optical unit is usually disposed adjacent to the fuser. The cooling fan is usually disposed in a position adjacent to the optical", "input_text": " optical photoconductive drum. 
The laser beam is designed to draw an electrostatic image on a photoconductive cylindrical surface of the drum. Particles of toner supplied to the drum serve to visualize the electrostatic image on the photoconductive cylindrical surface. The visible image of the toner can be transferred to the printing medium, such as a sheet of paper, from the photoconductive cylindrical surface of the drum. When the transferred toner is subjected to heat, the particles of the toner are fused so that the fused toner is deposited onto the printing medium. A fuser, such as a heat roller, may be employed to fuse and deposit the particles of the toner.\nThe laser beam is in general supplied from a scanning beam generating unit or optical unit. The rotating mirror having a shape, such as a polygon, causes the laser beam, emitted from a laser, to scan across the optical photoconductive drum along the meridian. Each facet of the polygon thus generates a scanning laser beam directed to the optical photoconductive drum. A motor drive is usually employed to generate rotation of the polygon.\nIn general, the fuser is preferably disposed adjacent or closer to the optical photoconductive drum in a laser printer. Such location of the fuser and drum enables a rapid fusion of the particles of the toner which have been transferred onto the printing medium. The toner can be fused soon after it has been transferred onto the printing medium. On the other hand, it is likewise preferable that the optical unit is disposed adjacent or closer to the optical photoconductive drum. Accordingly, the optical unit is usually disposed closer to the fuser. The smaller the size of the laser printer gets, the closer to the fuser the optical unit is disposed. However, if heat generated at the fuser is conducted to the optical unit, the motor drive tends to suffer from the deficiency of rotation, such as a jitter, because"} +{"output_text": " between adjacent emitters. 
The SDL-3460 provides a 1.2W cw output with a 10.degree. by 30.degree., FWHM beam divergence. The available wavelength range is approximately 790 to 860 nm. The SDL-3460 laser bars are available in a variety of lengths, from about 1.5 to about 5.5 inches. The SDL-3460 laser bars are also available in a variety of output powers, from about 0.5 to about 20", "input_text": " plurality if laser diode strips. The laser diode strips are parallel to each other and are divided into groups. Typical strips are 2 to 4.mu.m wide and are spaced about 10 to 20.mu.m apart. Neighboring groups of laser diode strips are separated by strip-shaped zones extending essentially parallel to the laser diode strips. The zones substantially attenuate super-radiation or laser radiation propagating in a direction other than the prescribed emission direction parallel to the laser diode strips. Up to a maximum of about 10 to 40 laser diode strips can belong to each group, depending on the gain in the laser, the quality of the resonator and other parameters. The attenuating zones may be constructed by proton implantation, channels etched through the active region, or other means. The partitioning provided by the attenuating zones increases the maximum output power attainable from the array and improves efficiency.\nAmong the commercially available laser diodes, the 2300 series of laser sold by SDL, Inc. provides up to 4 W cw optical output power and high brightness from laser diodes with a broad area emitting aperture. For example, the SDL-2360 has a 100.mu.m wide by 1.mu.m high emitting aperture and provides a 1.2W cw output with a 10.degree. by 30.degree., FWHM beam divergence. The available wavelength range is approximately 790 to 860 nm. The laser output is modulatable with rise and fall times of about 500 ps (2 GHz modulation bandwidth). Among the linear array laser diodes sold by SDL, Inc. are the SDL-3400 series of laser bars. 
For example, the SDL-3460 has 18 laser emitters driven in parallel and providing up to 20W cw optical output power or about 1.1W per emitter. Each emitter has a 200.mu.m wide emitting aperture with a 30.mu.m gap"} +{"output_text": " watches, the reflection-type liquid crystal display devices are mainly used.\nThe backlight device is generally composed of a light source, a light guide plate, a reflection plate, a diffusion plate, and a prism sheet. The light source is composed of a cold cathode fluorescent lamp (CCFL) or a hot cathode fluorescent lamp (HCFL). The light guide plate is composed of a transparent plate having a thickness of about 1 mm to about 5 mm. The reflection plate is composed of a white", "input_text": " to occur. This applies in particular to a proton exchange membrane fuel cell which is especially sensitive to CO poisoning because of its low operating temperatures.\nOne technique for increasing electrocatalytic cathodic activity during the reduction of oxygen and electrocatalytic anodic activity during the oxidation of hydrogen is to employ an electrocatalyst which is more active, corrosion resistant, and/or more poison tolerant. For example, increased tolerance to CO has been reported by alloying platinum and ruthenium at a 50:50 atomic ratio (see, D. Chu and S. Gillman, J. Electrochem. Soc. 1996, 143, 1685). The electrocatalysts proposed to date, however, leave room for further improvement. Applicants claim priority under 35 U.S.C. xc2xa7119 of Japanese Application No. 338787 filed Nov. 29, 1999. Applicants also claim priority under 35 U.S.C. xc2xa7365 of PCT/JP00/08206 filed Nov. 21, 2000. 
The international application under PCT article 21(2) was not published in English.\nThis invention relates to a tungsten sealing glass to be used for a glass tube in a fluorescent lamp which serves as a light source of a lighting equipment for a liquid crystal display device or the like.\nLiquid crystal display devices are broadly classified, depending upon manners for utilizing light sources, into a reflection-type of liquid crystal display devices using natural light or light from room lighting, and a transmission-type of liquid crystal display devices using light from a dedicated lighting equipment, for example, a backlight device. For those applications, such as notebook-type personal computers, TV monitors, and in-vehicle instruments or indicators, which require a high-quality display, the transmission-type liquid crystal display devices with the backlight device are mainly used. For wrist"} +{"output_text": "ographic processes are used to produce the master. The master is then coated with a metal layer or a metallic coating and the negative-microstructured metal layer is removed from the master to obtain a metal plate that can serve as a matrix in the compression moulding, embossing and/or injection moulding press.\nThe master is normally produced by means of a lithographic process. The master is then coated with a metal layer or a metallic coating and the negative-microstructured metal layer is removed", "input_text": " memory page. For instance, if ten processors, each with its own sequential memory access pattern, attempt to access the same DRAM bank simultaneously and each of the accesses is to a different memory page, the spread and memory latencies between the fastest and slowest responses might be more than 25:1.\nThe present invention is directed to allowing a high rate of transfer to memory and I/O devices for tasks which have real-time requirements. 
The present invention is also directed to allowing the system to buffer I/O requests from several processors within a multiprocessor at once with a non-blocking load buffer. Furthermore, the present invention is directed to extending the basic non-blocking load buffer to service a data processing system running real-time processes of varying deadlines by using scheduling of memory and peripheral accesses which is not strictly FIFO scheduling. In respect of replicating microstructures on plastic elements produced in a machine of the kind defined in the introduction, it is known to produce first an original master in some suitable way, and then to produce a matrix for use in said machine on the basis of this master. Matrices of this kind can be produced by coating a master or an original that has a positive microstructure on one surface with a metal layer or a metallic coating and removing the negative-microstructured metal layer from the master to thereby obtain a metal plate that can serve as a matrix in the compression moulding, embossing and/or injection moulding press. Normally each mould half can have its own matrix and a flowing, hot (approximately 400\u00b0 C.) plastic mass is pressed under high pressure into a delimited mould cavity formed by cavities in brought together mould halves. The flowing hot plastic mass is then allowed to solidify (at approximately 140\u00b0 C.) between the brought together mould halves before the mould halves are opened and the solidified element can be pressed out.\nLith"} +{"output_text": " RF signal received by the receiver, the lower the data rate at which the transmitter encodes the data. For example, if the RSSI is less than \u2212100 dBm, the transmitter may select a 1 or 2 megabits-per-second encoding protocol. If the RSSI is between \u2212100 dBm and \u221250 dBm, the transmitter may select a 5.5 or 11 megabits-per-second encoding protocol. 
If the RSSI is between \u221250 dBm and", "input_text": " more local oscillations to produce RF signals. The power amplifier amplifies the RF signals prior to transmission via an antenna.\nAs is also known, the receiver is coupled to the antenna and includes a low noise amplifier, one or more intermediate frequency stages, a filtering stage, and a data recovery stage. The low noise amplifier receives inbound RF signals via the antenna and amplifies them. The one or more intermediate frequency stages mix the amplified RF signals with one or more local oscillations to convert the amplified RF signal into baseband signals or intermediate frequency (IF) signals. The filtering stage filters the baseband signals or the IF signals to attenuate unwanted out of band signals to produce filtered signals. The data recovery stage recovers raw data from the filtered signals in accordance with the particular wireless communication standard.\nAs is further known, the transmitter of a wireless communication device transmits RF signals that represent baseband processed data to the receiver of another wireless communication device directly or through an access point, or base station. The particular type of baseband processing used to prepare the data for radio frequency transmission and subsequent data recapture by the receiver is dependent upon the standard, or standards, being supported by the wireless communication devices and upon the received signal strength of the RF signals. 
For example, if the standard being supported is IEEE802.11g, the baseband processing may include encoding data at 1 or 2 megabits-per-second using a direct sequence spread spectrum (DSSS) encoding protocol, a 5.5 or 11 megabits-per-second complimentary code keying (CCK) encoding protocol, or a 6, 9, 12, 18, 24, 36, 48, or 54 orthogonal frequency division multiplexing (OFDM) encoding protocol.\nThe particular encoding protocol selected is at least partially based on received signal strength indication (RSSI). In general, the weaker the signal strength of the"} +{"output_text": " may be provided by a separate database. The in-vehicle system also includes a processing device which is programmed to compare the vehicle\"\"s heading, as determined by the GPS receiver, with the heading of the approaching intersection. If the vehicle\"\"s heading is within a predetermined range of the heading of the approaching intersection, then the processing device determines that the vehicle is approaching the intersection at a safe speed and does not require the driver to brake. If the vehicle\"\"s heading is not within the predetermined range", "input_text": " if the vehicle does so begin to enter the intersection, then alert the driver, by means of an alarm or indication, to brake the vehicle prior to its entry into the intersection.\nThese and other objects are achieved by providing an in-vehicle system for determining and warning of potential violation of intersection traffic control devices. The in-vehicle system features a data storage device, a processing device, a Global Positioning System (xe2x80x9cGPSxe2x80x9d) receiver, a geographical information system (xe2x80x9cGISxe2x80x9d) digital database, an interface for entering and/or editing data into the data storage device, and an interface for alerting the driver of any impending violation of a traffic control device. 
The in-vehicle system may also include a visual or aural display for providing the driver with in-vehicle system-related information, a capability for communicating with an oncoming traffic control device to determine its status, and a capability for braking the vehicle without driver assistance, such as an auxiliary braking system or an interface which allows the in-vehicle system to gain control of the vehicle\"\"s primary braking system.\nThe in-vehicle system features a GPS receiver which generates the latitude and longitude, heading, and velocity of the equipped-vehicle. The in-vehicle system utilizes a GIS database and correlates the vehicle\"\"s latitude, longitude and heading, as generated by the GPS receiver, with approaching latitude and longitude of intersections. The GIS database includes a geographic location digital database that contains the positional data (e.g., latitude and longitude) and informational data (e.g., number of lanes) of all roadways within the geographic region covered by the database. The GIS database may also include intersection data (e.g., the types and locations of traffic control devices at each intersection) or such data"} +{"output_text": "alon from the other components of the wavelength locker. The etalon is typically mounted in a cavity formed in the circuit board, and the cavity is typically filled with a dielectric material. The cavity is typically filled with a dielectric material to provide a hermetic seal to prevent moisture from entering the cavity and to provide a thermal path for heat generated by the thermoelectric cooler to escape the cavity. However, the dielectric material typically used to fill the cavity is a poor thermal conductor, and the cavity", "input_text": " sufficiently inexpensive for practical use. 
Difficulties arise in each of these areas.\nEach WTL for use in a DWDM is typically provided in a miniature \u201cbutterfly\u201d package for mounting to a circuit board also containing microcontrollers and other components. The circuit boards are mounted in a parallel array within the DWDM with, typically, one board per ITU channel. Hence, a forty ITU channel DWDM employs forty circuit boards; an eighty ITU channel DWDM employs eighty circuit boards. Current state-of-the-art WTLs typically draw about ten watts of power, thus requiring 400 watts of power or more for the a forty channel DWDM and correspondingly more power for 80 or 160 channel DWDMs. A significant portion of the power is consumed by thermoelectric (TE) coolers provided for controlling the temperature of the semiconductor laser of the WTL. With the WTLs already consuming considerable power, it is particularly important that the wavelength locker be configured so as to minimize power consumption, particularly the temperature-controlled etalon. Minimizing power consumption, however, typically requires that the etalon be configured to provide numerous closely-spaced transmission peaks (i.e. to have a narrow free spectral range) such that relatively little heating or cooling is required to expand or contract the etalon or change its index of refraction sufficiently enough to align one of the transmission lines of the etalon with a selected ITU grid line. Using numerous closely spaced peaks, however, increases the risk that the wavelength locker will lock the transmission wavelength of the WTL to the wrong wavelength. Also, to provide numerous closely-spaced transmission peaks, the etalon typically must be configured to have a very short optical axis, thereby making it more difficult to fabricate and align.\nMoreover, difficulties arise in adequately insulating the temperature-controlled et"} +{"output_text": " pellets, etc.) or a single particle. 
The solid support material can be in the form of a porous material, such as a porous glass, porous silica, porous polymer, etc. The solid support material can be in the form of a porous glass, porous silica, porous polymer, etc. The solid support material can be in the form of a porous glass, porous silica, porous polymer, etc. The solid support material can be in the form of a porous glass, porous silica, porous polymer", "input_text": " a monomer using the initiator of Formula I, to produce polymers having an azlactone group at one terminal end, and then subsequently reacting the polymers with a polyfunctional compound of the formula R5(ZH)p, where p is at least 2.\nAlthough the repeat unit of the oligomeric initiators is not shown in Scheme III, it will be appreciated that the xe2x80x94ZH groups of a multifunctional compound R5(ZH)p may react with the multiple, pendent azlactone groups on the same oligomer, or the azlactone groups on different oligomers to form a crosslinked composition. Additionally it will be appreciated that the multiple R5 groups depicted in Formula II may be the same R5 group whose multiple ZH groups react with adjacent azlactone groups on the same oligomer.\nIn another embodiment, the multifunctional initiators may comprise a solid support having a plurality of initiator moieties on the surface thereof. Such initiator-functionalized supports have the general structure (corresponding to Formula II): \nwherein X, R1, ON(R2)2, R3, R4, X, Z, q, p and n are as previously described for Formula II and SS is a solid support corresponding to R5. For clarity, the repeat unit of the oligomeric initiator is not shown in Formula IV. The solid support material includes functional groups to which initiator molecules of Formula I can be covalently attached for effecting polymerization on the solid surface. 
Useful functional groups include hydroxyl, amino and thiol functional groups corresponding to xe2x80x94ZH.\nThe solid support material (SS) can be organic or inorganic. It can be in the form of a solid, gel, glass, etc. It can be in the form of a plurality of particles (e.g., beads,"} +{"output_text": ". It is also desirable that such techniques be applicable to a variety of downhole tools and sampling configurations.", "input_text": "b employs a probe 2b to sealingly engage and draw fluid from the formation 16, in similar fashion to the downhole tool 1a and probe 2a described above.\nIt is therefore desirable that sufficiently \u201cclean\u201d or \u201cvirgin\u201d fluid be extracted or separated from the contaminated fluid for valid testing. In other words, the sampled formation fluid should have little or no contamination. Attempts have been made to eliminate contaminates from entering the downhole tool with the formation fluid. For example, as depicted in U.S. Pat. No. 4,951,749, filters have been positioned in probes to block contaminates from entering the downhole tool with the formation fluid.\nOther techniques directed towards eliminating contaminates during sampling are provided by published U.S. Patent Application No. 2004/0000433 to Hill et al. and U.S. Pat. No. 6,301,959 to Hrametz et al., the entire contents of both being hereby incorporated by reference. FIGS. 3 and 4 are schematic illustrations of the probe solution disclosed by the Hrametz patent. Hrametz describes a fluid sampling pad 13 mechanically pressed against the borehole wall. A probe tube 18 extends from the center of the pad and is connected by a flowline 23a to a sample chamber 27a. A guard ring 12 surrounds the probe and has openings connected to its own flowline 23b and sample chamber 27b. 
This configuration is intended to create zones so that fluid flowing into the probe is substantially free of contaminating borehole fluid.\nDespite such advances in fluid sampling, there remains a need to reduce contamination during formation evaluation. In some cases, cross-flow between adjacent flowlines may cause contamination therebetween. It is desirable that techniques be provided to assist in reducing the flow of contamination of formation fluid entering the downhole tool and/or isolate clean formation fluid from contaminates"} +{"output_text": " to the cell container, or to the driving system. For example, U.S. Pat. No. 4,948,769 discloses a technique for improving the lifetime of the image display element by adding a compound having a standard oxidation-reduction potential of 0.7 v or above to the color forming and bleaching substance.\nFurther, U.S. Pat. No. 4,948,769 discloses a technique for improving the lifetime of the image display element by adding", "input_text": ", in which the color forming and bleaching medium consisting of N,N'-di(p-cyanophenyl)-4,4'-bipyridinium salt, potassium chloride, sodium ferrocyanide, diluted sulfuric acid, etc., is used.\nU.S. Pat. No. 3,806,229 describes an image display apparatus, wherein a color forming and bleaching medium consisting of salts of dipyridinium compounds, and an adjuvant such as substituted hydroquinones having a standard oxidation-reduction potentials of 0.7 v and above, ferrous salts, or 1,4-di(dialkylamino)benzenes, etc, is held between opposing electrodes.\nFurther, U.S. Pat. No. 3,930,717 discloses a similar type of image display device.\nIn either of the abovementioned image display devices as taught in the prior patents, however, there has been pointed out that repetitive durability of the image display element, i.e., its lifetime, constitutes a problem. 
In more detail, such phenomena as insufficiency in color forming, insufficiency in color bleaching, occurrence of side-reaction in the color forming, occurrence of irregularity in the formed color, changes in color tone, etc. remarkably curtail the lifetime of the image display element. The main cause for such shortened service life of the element is presumed to be electrode contamination. The contamination is said to be caused by various factors such as impurities contained in the electrochemical color forming and bleaching substance, products from chemical changes in such electrochemical color forming and bleaching substance, impurities discharged from the cell container, inadequacy in the driving system, and others, all these factors being combined sophisticatedly.\nTo improve such disadvantages, there have been proposed improved techniques concerning new adjuvants or auxiliary agents to be added to the color forming and bleaching substance, or"} +{"output_text": " programming were developed. The first was the local access channel (LACH) which was a dedicated channel that provided local programming to subscribers. The second was the local commercial insertion channel (LCIC) which was a dedicated channel that provided local advertising to subscribers. The third was the local origination channel (LOC) which was a dedicated channel that provided local programming to subscribers. The fourth was the local commercial insertion channel (LCIC) which was a dedicated channel that provided local advertising to subscribers. The fifth", "input_text": " on the basis of their ability to specifically bind their ligand. The specificity of binding is defined in terms of the dissociation constant Kd of the aptamer for its ligand. 
Aptamers can have high affinity with Kd range similar to antibody (pM to nM) and specificity similar/superior to antibody (Tuerk and Gold, Science, 249:505, 1990; Ellington and Szostak, Nature, 346:818, 1990).\nMany aptamers have a stem-loop structure in which the bases in the loop and the stem are intimately involved in interaction with the ligand. RNA aptamers have been isolated against the protease-sensitive, N-terminus of PrP (Weiss et al., J. Virol. 71:8790-8797, 1997) but these do not discriminate between PrPC and PrPSc and are sensitive to nucleases. Therefore, there is a need in the art to design and utilize aptamers for binding to specifically folded prions, specifically those prions that are infectious and disease-causing in animals/mammals, in order to prevent the transmission and spread of such diseases in the food supply. The present disclosure provides improved aptamers for detecting the presence of PrPSc where the aptamers are not sensitive to nucleases. Distribution of full motion video data has evolved from early television broadcasting to meet viewer demand. Earliest video distribution was by point-to-point wiring between a camera and a video monitor. This was followed by scheduled television broadcasting of programming over the public air waves. In the 1960s, Community Antenna Television (CATV) was chartered to provide off-air television signals to viewers in broadcast reception fringe areas. Later, under FCC regulation, the CATV industry was required to provide local access and original programming in addition to off-air broadcast signal distribution.\nIn response, several sources of cable network"} +{"output_text": " when the polymer dielectric materials are exposed to high temperatures; and (6) The overall mobile electrostatic carrier lifetime may be affected by the metal electrodes, especially when the metal electrodes are exposed to high temperatures. 
\nIn addition, the bipolar MESC design is not suitable for use with a thin wafer that is larger than the MESC itself. For example, the bipolar MESC design is not suitable for use with a thin wafer that is larger than the MESC itself because the", "input_text": ". In contrast to a unipolar MESC, the thin wafer does not need to be electrically contacted for charging and discharging because the capacitor is formed between the two electrodes or multiple pairs of electrodes. Such a bipolar MESC is usually made from metal electrodes and polymer dielectric layers; therefore it is limited in terms of thin wafer thermal process and wet chemical process capabilities. As shown in FIG. 1B, bipolar MESC 20 has both of electrodes of opposite polarity (negative electrodes 22 and positive electrodes 28) embedded under dielectric layer 24 and in the MESC itself. This bipolar MESC design relies upon the electric field generated between the two electrodes to hold thin wafer 26 in place. When using a bipolar MESC, during the chucking and dechucking, the thin wafer does not need to be electrically contacted.\nCurrent bipolar mobile electrostatic carriers are often made from metallic electrodes and polymer dielectric layers, because of which the overall performance of the MESCs is limited with some of the following concerns: (1) The existence of metal and polymer limits the thin wafer processing temperature to be typically less than 300\u00b0 C., which means that current MESCs cannot be reliably used for wafer processing much above 300\u00b0 C.; (2) The thin wafer and processing equipment may be contaminated by the MESC structural materials, especially when processed at elevated temperatures; (3) The thermal (TCE) mismatch between the MESC structural materials (metal & polymer) and the thin semiconductor wafer may cause warpage or even breakage of the thin wafer (and/or formation of microcracks); (4) The MESC structural materials (metal & polymer) may not be chemically compatible with 
commonly used dry and wet chemical etching and deposition processes; (5) The overall mobile electrostatic carrier lifetime may be affected by the dielectric qualities of the polymer dielectric materials, especially"} +{"output_text": " said receiving user.\nAccording to one embodiment of the invention, the step consisting in generating said second marked complementary flow includes steps wherein: a second modified complementary flow comprising complementary digital information capable of allowing the restoration of the nominal content from the modified content is generated, marking information are determined so as to enable the restoration of said marked audiovisual sequence from said nominal audiovisual sequence, said marking information being determined further to the operation of marking said nominal audiovisual sequence", "input_text": " content and said modified content includes steps wherein: a second modified complementary flow comprising complementary digital information capable of allowing the restoration of the nominal content from the modified content is generated, at least a piece of marking information is determined as a function of the bit differences between the marked content and the nominal content; said marked complementary digital information are determined as a function of said complementary information and said marking information. This embodiment has the advantage of being implemented on a known protection module. In this embodiment, said marking information, said complementary information and said marked complementary information can have an identical format. This more particularly makes it possible to make the transmission method even safer. 
\nBesides, in one embodiment, the step consisting in generating said second marked complementary flow includes steps wherein: a second modified complementary flow comprising complementary digital information capable of allowing the restoration of the nominal content from the modified content is generated, marking information are determined so as to enable the restoration of said marked audiovisual sequence from said nominal audiovisual sequence, said marking information being determined further to the operation of marking said nominal audiovisual sequence; said marked complementary digital information are determined as a function of said complementary information and said marking information, wherein said marking information, said complementary information and said marked complementary information have an identical format. According to another embodiment of the invention, the step consisting in determining a difference between said marked content and said modified content includes steps wherein: said marked content and said modified content are compared at the bit level so as to determine said difference. This more particularly makes it possible to easily obtain the difference between the marked content and the modified content. \nIn order to obtain a marked audiovisual sequence which is also customized, said marked complementary information may include a customization identifier. This customization identifier can include a single identifier of said receiving item of equipment and/or a single identifier of"} +{"output_text": " in a computer system. 
The method comprises the steps of:\n(a) providing a plurality of access control lists (ACLs) each of which is associated with a respective one of a plurality of different types of information in the computer system;\n(b) providing a plurality of access control entries (ACEs) each of which is associated with a respective one of the plurality of different types of information in the computer system;\n(c) providing a plurality of access control entries (ACE", "input_text": " produced with invention catalyst system have utility according to molecular weight, level of comonomer incorporation, where included, and polydispersity (\u201cMWD\u201d), etc. for their conventional and known uses. Thus films, fibers, and moldable thermoplastics by any of the known means of melt processing and subsequent extrusion, and/or, thermoforming are typical applications. In such, inclusion of additives such as processing aids, stabilizers, pigments, fillers as conventionally known can be utilized. High density polyethylene and isotactic polypropylene films, including those that are oriented in one or both axes and those modified with other components such as hydrocarbon tackifier resins are specific examples.\nFurther, inclusion of other thermoplastic components both in greater and lower amounts will be useful as known for various polymer blends and compositions. Thus the use of elastomeric polyolefins of the invention for impact modification of polar engineering resins or in co-vulcanizable elastomer blends (typically when containing diolefin comonomer and/or further derivatized as by free-radical grafting of polar monomers) is suitable. For a preferred derivatization process see WO-A-93/12148 and equivalent U.S. Pat. No. 5,424,367. 
The present invention relates to a method and apparatus for controlling access to and corruption of information in a computer system.\nPCT/GB91/00261 (WO91/13403) also by the present inventors (the disclosure of which is incorporated herein by reference) discloses a method and apparatus particularly concerned with the detection and containment of hostile programs such as \u201cvirus\u201d programs within computer systems. In this document there is disclosed a method of (and related apparatus for) controlling access to and modification of information"} +{"output_text": " from the optical axis, thereby obtaining the SEM image.\nIn the SEM, the electron beam is scanned on the specimen, and the secondary electrons and the back scattered electrons are detected. The secondary electrons and the back scattered electrons are detected by a detector which is separated from the optical axis of the electron beam. The secondary electrons and the back scattered electrons are detected by a detector which is separated from the optical axis of the electron beam. The secondary electrons and the back scattered electrons are detected by a detector", "input_text": ", if the FET (T101) having a shorter transition period than the set transition period is used, masking is carried out by the start timer 103 irrespective of the achievement of the overcurrent detecting function. As a result, there is generated a period for which the precious overcurrent detecting function cannot be used.\n(2) When a type of the FET (T101) to be used as a semiconductor switch is changed, a gate capacity of the FET (T101) is varied so that the transition period is changed. Thus, the transition period is changed depending on a structure of a gate circuit or a gate characteristic of the FET. Correspondingly, a duration of the start timer 103 is to be set. 
With a structure in which the overcurrent detecting device is provided in an IC, it is necessary to add a regulating terminal for regulating the timer duration on the outside of the IC to an IC package and to add a regulating circuit to the outside of the IC. This causes an increase in a cost. In a scanning electron microscope (SEM), a specimen is irradiated with an electron beam while scanning the electron beam. Then, a signal by means of electrons emitted from the specimen is converted into an intensity modulation input into a CRT, thus obtaining a scanned image (SEM image) of the specimen.\nJapanese Patent Laid-open No. 7-192679 discloses that electrons emitted from a specimen are separated into secondary electrons and back scattered electrons so as to obtain the SEM image, thereby observing geometric and material information of the specimen with a higher resolution. Moreover, the Japanese Patent Laid-open No. 7-192679 discloses that, by using Wien filter type electromagnetic field which deflects the secondary electrons and the back scattered electrons from an optical axis of the electron beam, a detector is brought away from the optical axis, and thus the secondary electrons and the back scattered electrons are separated"} +{"output_text": ". The single crystal silicon wafer can be bonded to the oxide-coated silicon wafer by a variety of processes, including direct bonding and fusion bonding.\nIn the direct bonding process, two silicon wafers are placed in contact with each other and then heated to a temperature of about 400\u00b0 C. to about 1200\u00b0 C. for a period of time to cause the two wafers to bond together. The direct bonding process is described in U.S. Pat. No. 5,200,262 to F", "input_text": " methods for making same.\nTo date, the semiconductor material most commonly used in semiconductor-on-insulator structures has been silicon. 
Such structures have been referred to in the literature as silicon-on-insulator structures and the abbreviation \u201cSOI\u201d has been applied to such structures. Silicon-on-insulator technology is becoming increasingly important for high performance photovoltaic applications (e.g., solar cells), thin film transistor applications, and displays, such as, active matrix displays. Known silicon-on-insulator wafers consist of a thin layer of substantially single crystal silicon (generally 0.1-0.3 microns in thickness but, in some cases, as thick as 5 microns) on an insulating material.\nFor ease of presentation, the following discussion will at times be in terms of silicon-on-insulator structures. The references to this particular type of semiconductor-on-insulator structure are made to facilitate the explanation of the invention and are not intended to, and should not be interpreted as, limiting the invention's scope in any way. The SOI abbreviation is used herein to refer to semiconductor-on-insulator structures in general, including, but not limited to, silicon-on-insulator structures. Similarly, the SOG abbreviation is used to refer to semiconductor-on-glass structures in general, including, but not limited to, silicon-on-glass structures. The SOG nomenclature is also intended to include semiconductor-on-glass-ceramic structures, including, but not limited to, silicon-on-glass-ceramic structures. The abbreviation SOI encompasses SOG structures.\nThe various ways of obtaining SOI structures include epitaxial growth of Si on lattice matched substrates. An alternative process includes the bonding of a single crystal silicon wafer to another silicon wafer on which an oxide layer of SiO2 has been grown"} +{"output_text": " will explain the SHV projector with reference to FIG. 4.\nThe SHV projector is constructed by combining a first modulation optical system similar to the first modulation optical system of FIG. 
1 with a second modulation optical system similar to the second modulation optical system of FIG. 2. The first modulation optical system is constructed by combining a first lens array (L1) with a first lens (L2) and a second lens (L3) with a second lens (L4). The second modulation", "input_text": " an image to be displayed on the screen is influenced by an optical F-number and the performance of the display device, there is no possibility that the image is displayed at a contrast value exceeding the proportion of thousands to one (thousands:1) in a situation of ensuring appropriate brightness. On the contrary, the optical system of FIG. 2 is constructed so as to project an image on a screen (not shown) after once forming an image, which has been brought by the first modulation optical system similar to FIG. 1, on the Y device 40 for a further modulation. Consequently, the contrast of the image displayed on the screen becomes equal to or more than the proportion of a million to one (a million: 1) as a result of multiplying a contrast value of the first modulation optical system by a contrast value of the second modulation optical system.\nIn the projector adopting the optical system of FIG. 2, however, there exists a reality that the resolving power (i.e. number of pixels) of the Y device 40 determines a final resolving power of an image projected on the screen. In even a highest-definition device produced in the market currently, this resolving power would be 4 k\u00d72 k pixels (horizontal: 4,096 pixels, vertical: 2,160 pixels) at the highest.\nUnder such a situation, there is recently proposed a projector of FIG. 3 in order to attain a higher resolving power (8 k\u00d74 k pixels). This projector is one proposed by Japan Broadcasting Corporation, which is referred to as \u201cSuper Hi-Vision (SHV)\u201d. 
Here, the super Hi-Vision is one of a LSDI (Large Screen Digital Imagery) system with 7680\u00d74320 pixels specified in Recommendation ITU-R BT.1769 \u201cparameter values for an expanded hierarchy of LSDI image formats for production and international program exchange\u201d. We"} +{"output_text": " organism is a commensal, which is a member of the normal flora of the host. In other cases, the colonizing organism is a pathogen, which is a member of the normal flora of the host that is capable of causing disease.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer interconnection structure.\nIn recent years, the integration density of semiconductor devices has been increased, and the multilayer", "input_text": ") within the envelope will migrate to the ambient environment. The envelope also includes a plurality of pins extending outwardly from the envelope to discourage non-targeted animals from attempting to ingest the envelope or its contents.\nIn another preferred embodiment, the envelope of the insecticide packet is protected by an outer pocket. The pocket is provided with a second permeable web that is in contact with permeable web of the envelope to facilitate the migration of volatile substances from the envelope to the ambient environment. 
A plurality of pins extends outwardly from the pocket.\nIn yet another preferred embodiment, the envelope of the insecticide packet is enclosed within a pocket that has holes to expose the permeable web of the envelope to the ambient environment.\nAnother preferred embodiment of the present invention includes an insecticide packet and a buoyant support arranged so that at least a portion of the permeable web of the envelope is above the water line when the packet and buoyant support float on the surface of a water body.\nPreferably, the insecticide includes a stomach poison for the types of biting insect targeted by the insecticide package mixed with a food that is preferred by the biting insects. For example, a combination of boric acid and cattle blood may be effective to kill female biting insects, including mosquitoes. Volatile organic compounds, such as odoriferous compounds from the food itself, may also be included with the insecticide to attract the biting insects to the insecticide packet. Most bacterial diseases begin with colonization of a particular mucosal surface (Beachey et al., 1981, J. Infect. Dis. 143:325\u2013345). Successful colonization requires that an organism overcome mechanical cleansing of the mucosal surface and evade the local immune response. The process of colonization is dependent upon specialized microbial factors that promote binding to host cells (Hultgren et al., 1993 Cell, 73:887\u2013901). In some cases the colonizing"} +{"output_text": " are typically used for high pressure applications, and are typically designed for HPLC. They are typically used in conjunction with a high pressure pump, and are typically used in conjunction with a high pressure detector. They are typically used in conjunction with a high pressure pump, and are typically used in conjunction with a high pressure detector. They are typically used in conjunction with a high pressure pump, and are typically used in conjunction with a high pressure detector. 
They are typically used in conjunction with a high pressure pump, and", "input_text": " the most back pressure of any pump type available, and can deliver very precise and easily regulated flow velocities. They provide significantly pulsing flow, pump unidirectionally, and most types require the incorporation of check valves. This type of pump can easily be designed to deliver micro-flow capability, and can be configured to drive multiple heads with a single drive unit. This type of pump is suitable for FIA, provided the detector can tolerate some flow pulsation. It cannot be used in SIA due to its unidirectional pumping action. It is the pump of choice for most HPLC work, and in instrumentation for process control, due to its high reliability. ELDEX is a manufacturer of a unit typical of this type of pump. Piston pumps for instrumentation are usually designed primarily for HPLC, which requires pressure capabilities significantly higher than FIA or SIA (1000-5000 psi), and thus makes them relatively expensive, which expense increases rapidly if multiple streams must be pumped.\nA unique subset of piston pumps is produced by FMI, Inc. This pump utilizes a special pumping cycle incorporating a piston that simultaneously reciprocates (providing pumping action), and rotates (providing valving action). This type of pump is self-priming, can pump against significant back pressure (but less than pumps designed specifically for HPLC), has no check valves, can deliver bi-directional pumping action, and allows very precise and easily regulated flow velocities. The pumping action is inherently pulsating. These pumps can easily deliver micro-flow capability, and can be configured to drive multiple heads with a single drive unit. They are suitable for FIA, and as they are bi-directional, can be used in SIA (again, with the limitation that the detector tolerate flow pulsation). 
They are not suitable for HPLC due to low output delivery pressure.\nSyringe pumps are essentially very large piston pumps. They"} +{"output_text": " of the child, the parent may also want to ensure that the child does not spend too much money on the phone service. For example, the parent may want to ensure that the child does not spend too much money on text messaging, or on games, etc.\nAnother partial solution to the problems associated with postpaid cellular phone abuse is the pay-as-you-go cellular phone. In a pay-as-you-go phone, the user of the phone can use the phone for", "input_text": " issue exists between employers and employees and other parties in similar administrator/user relationships with respect to the use/abuse of cell phones and other devices. For example, an employer may want an employee to have a communications or mobile computing device, but may not want to pay for certain services or applications that the employee can access with the device or may want to limit how, when, and how much of those services or applications can be used by the employee. Likewise, a government agency or school might be willing to pay for or subsidize certain communications services or applications, but not others. Without the ability to somehow restrict the employee's ability to use services or applications that the employer does not want to pay for or to shift payment obligations for those services or applications to the employee, many employers are forced to give their employees the devices anyway and hope for the best.\nOne partial solution to the problems associated with postpaid cellular phone abuse is the prepaid cellular phone. Prepaid phone services limit spending because the user of the phone can only use what has been paid for in advance. 
Many children, however, are not responsible or mature enough to adequately track and maintain their prepaid phone service accounts, and many parents have too many other obligations to keep close track of their children's cell phone use, so as to make sure the phone service accounts are adequately funded all of the time. The net result can be disastrous. For example, if a child uses up all of the funds in their prepaid account, and their phone service provider shuts down access to its services, the child will not be able to call a parent in the event of an emergency, or arrange to be picked up after school or a sporting event, etc.\nThus, a prepaid phone service does not solve the problem of ensuring availability of key services even if the prepaid account has run out of money. In addition to insuring the safety"} +{"output_text": "opolymerization initiation system is described (JP-A-2001-277771). However, a specific compound does not disclosed in the patent. Also, the photopolymerizable composition is required to have a high sensitivity and a high resolution.\nIn the photopolymerizable composition, a photopolymerization initiator is used in order to increase the sensitivity and the resolution. However, the photopolymerization initiator is generally a radical polymerization initiator, and the radical polymerization initi", "input_text": " a mask material for paste printing to a printed circuit board or the like, a mask for forming a pattern having approximately 100 to 200 \u03bcm utilizing a photo-decomposable resin sheet and a production method of the mask are described (JP-A-8-258442). However, a specific compound is not disclosed in the patent. 
Also, the controlled development processing is indispensable in order to form the pattern while regulating the degree of exposure and development.\nOn the other hand, in order to form a pattern in a thick layer by a simple process, for example, pattern-formation by laser processing is known, in which the base material per se is removed, deformed or discolored by imagewise irradiation of laser beam. For instance, a method of recording information, for example, a lot number on a product (for example, video tape or home electric appliances) composed of a variety of base materials is utilized as a laser marker. In such cases, conventional resins are used as they are as the base material.\nIn the pattern-formation by laser processing, it is desired that a laser engraving portion (concave portion) be rapidly formed. For this purpose, a high-sensitive laser-decomposable resin composition and a high-sensitive laser-decomposable pattern-forming material are needed.\nIn particular, in case of a flexographic printing plate precursor of a direct drawing type by laser (so-called flexographic printing plate precursor for laser engraving), since ease of engraving by laser beam (engraving sensitivity) dominates plate-making speed, a flexographic printing plate precursor for laser engraving using a high-sensitive laser-decomposable resin composition has been required.\nOn the other hand, a photopolymerizable composition containing a polyurethane resin and a lithographic printing plate precursor for laser scanning exposure using the photopolymerizable composition in a phot"} +{"output_text": " large number of end points, are based on the Ethernet protocol. In this case, the data is transmitted in the form of frames. The frames are transmitted in the form of data packets, which are encapsulated in the frames. The data packets are transmitted via the data network on the basis of the Ethernet protocol. 
The data packets are routed via the data network on the basis of the Ethernet protocol. The data packets are routed via the data network on the basis of the Ethernet protocol. The data packets are", "input_text": " waste mud contains saline water, disposal of the aqueous fraction may pose a problem. Although salt concentrations in \u201csaline\u201d groundwaters are low compared to marine waters, they are often sufficiently high to damage soils and bodies of freshwater. Saline water may be disposed of by storage in a lagoon, in which the water slowly evaporates and the salt precipitates. Although this method greatly reduces the volume of the waste material, the concentrated salt evaporite that remains can be highly damaging to soil and groundwater, and requires either alternative disposal or further treatment. Another method of disposal is permanent storage of the saline water in an impermeable landfill. This method is expensive, may result in leaks, and is not available in every location.\nConsequently, there is a long-felt need in the art for a method of waste drilling mud disposal that requires no disposal of hydrocarbons and creates no persistent pollution. There is a further long-felt need in the art for a method of waste drilling mud disposal that requires no disposal of saline water. There is a further long-felt need in the art for a method of treatment of waste drilling mud that requires no disposal of hazardous pollutants. There is a further long-felt need in the art for a method of cost-effective diesel recycling from drilling mud. In data networks, devices are linked to one another via connections in order to interchange data with one another. With regard to the devices, a distinction is drawn between central devices, for example servers, and end points, for example PCs. End points such as these are frequently also referred to as clients. In general, the devices communicate with one another in data networks on the basis of allocated addresses. 
When the data is interchanged in a data network on the basis of the Internet Protocol, the addresses which are used are so-called IP addresses (IP=Internet Protocol).\nMany data networks, in particular those with a"} +{"output_text": ") The voice signal is inputted to the microphone 1. The voice signal is converted to the digital signal by the microphone 1. The digital signal is inputted to the linear codec 3. The linear codec 3 converts the digital signal to the voice signal. The voice signal is inputted to the DSP 4. The DSP 4 processes the voice signal. The voice signal is inputted to the ROM 52. The ROM 52 stores the voice coding procedures used in the DSP 4. The voice", "input_text": "A need therefore exists for a high voltage bushing that overcomes the aforesaid technical difficulties and reduces the electric field in the region of the \u201ctriple point\u201d to prevent the formation of corona when the bushing is wet. 1. Field of the Invention\nThe invention relates to a radio communication apparatus used for both digital communications and analog communications.\n2. Description of the Prior Art\nFIG. 3 shows a block diagram of a conventional radio communication apparatus. In FIG. 3, a microphone 1 changes human voice to voice signal. A linear codec 3 converts the voice signal to a digital signal. A digital signal processor (DSP) 4 processes the digital signal. A ROM 52 stores instruction codes (voice coding procedures) used in the DSP 4. A digital modulation/demodulation portion 6 modulates the coded digital signal to form the modulated digital signal and demodulates the modulated signal to form the coded digital signal.\nAn analog voice signal processing portion 12 modulates the voice signal to form the modulated voice signal and demodulates the modulated voice signal to form the voice signal. An FM modulation/demodulation portion 10 modulates the modulated voice signal to form the FM signal and demodulates the FM signal to form the modulated voice signal. 
A radio frequency transmitter/receiver 7 amplifies the signals which are received from the modulation/demodulation portion 6 or FM modulation/demodulation portion 10 and sends them to an antenna 8, and receives radio frequency signal from the antenna 8 and sends them to digital modulation/demodulation portion 6 or the FM modulation/demodulation portion 10. A speaker 2 converts the voice signal which is received from linear codec 3 or analog voice signal processing portion 12 to the voice and outputs the voice. A control portion 11 controls the devices in the radio communication apparatus.\nThe operation of the above conventional art is explained hereinafter.\n(1"} +{"output_text": " of JAB1 protein in a breast cancer cell line resulted in reduced levels of p27 protein in the cell line (Yang, et al., Journal of Biological Chemistry 275:24735-24739 (2000)).\nMethods of using the JAB1 gene or protein in breast cancer diagnostics and therapeutic agents for breast cancer have been proposed (see, e.g., U.S. Pat. No. 6,251,601 and JAB1 antibody, available from Genentech, San", "input_text": " of p27 protein play a key role in regulating the level of p27 protein in cells.\nHER-2 protein is a protein that is believed to be involved in the degradation of p27 protein. An inverse relationship between the amount of HER-2 protein and the amount of p27 protein has been found in primary breast tumor samples. Overexpression of HER-2 protein in a breast cancer cell line resulted in reduced levels of p27 protein in the cell line (Yang, et al., Journal of Biological Chemistry 275:24735-24739 (2000)).\nMethods of using the HER-2 gene or protein in breast cancer diagnostics and therapeutic agents for breast cancer have been proposed (see, e.g., U.S. Pat. No. 6,251,601 and Herceptin\u00ae antibody, available from Genentech, San Francisco, Calif.). 
However, overexpression of HER-2 protein via gene amplification of the HER-2 gene has been found to date in only approximately 25% of breast cancer patients. Furthermore, one study has shown that less than half of the HER-2 overexpressing breast cancer patients in the study responded to HER-2 antibody-based treatment (Vogel, et al., Journal of Clinical Oncology 20: 719-726 (2002); Baselga J et al., Seminars in Oncology, Vol 26(4): Suppl. 12 pp 78-83, 1999; Slamon D. J., et al., The New England Journal of Medicine, Vol 344 pp 783-792, 2001; Vogel C. L et al., Journal of Clinical Oncology, Vol 20, pp 719-726, 2002).\nIn vitro studies of JAB1 protein (also referred to as CSN5 or p38JAB1) demonstrate that JAB1 protein contributes to the degradation of p27 protein, as overexpression"} +{"output_text": "9d\nMWD/LWD tools are typically lowered into a borehole on a wireline cable, which is a long, thin, flexible cable that is spooled out of a drum on the surface. The wireline cable is connected to a bottom hole assembly (BHA) at the end of the tool. The BHA typically includes a drill bit, a drill string, a drill collars, and other components that are used to drill the well. The BHA is lowered", "input_text": " near a drill hole, or xe2x80x9cborehole.xe2x80x9d A wealth of other information that is useful for oil well drilling and production is frequently derived from such measurements. Originally, a drill pipe and a drill bit were pulled from the borehole and then instruments were inserted into the hole in order to collect information about down hole conditions. This technique, or xe2x80x9cwireline logging,xe2x80x9d can be expensive in terms of both money and time. In addition, wireline data may be of poor quality and difficult to interpret due to deterioration of the region near the borehole after drilling. These factors lead to the development of Logging-While-Drilling (LWD). 
LWD operations involve collecting the same type of information as wireline logging without the need to pull the drilling apparatus from the borehole. Since the data are taken while drilling, the measurements are often more representative of virgin formation conditions because the near-borehole region often deteriorates over time after the well is drilled. For example, the drilling fluid often penetrates or invades the rock over time, making it more difficult to determine whether the fluids observed within the rock are naturally occurring or drilling induced. Data acquired while drilling are often used to aid the drilling process. For example, MWD/LWD data can help a driller navigate the well so that the borehole is ideally positioned within an oil bearing structure. The distinction between LWD and MWD is not always obvious, but MWD usually refers to measurements taken for the purpose of drilling the well (such as navigation) whereas LWD is principally for the purpose of estimating the fluid production from the earth formation. These terms will hereafter be used synonymously and referred to collectively as xe2x80x9cMWD/LWD.xe2x80x"} +{"output_text": " fiber is no longer useful and must be replaced.\nThe present invention is directed to a method and apparatus for detecting the presence of a flaw in a fiber optic cable. More particularly, the present invention is directed to a method and apparatus for detecting the presence of a flaw in a fiber optic cable by measuring the light reflected from the flaw.\nFiber optic cables are used in a wide variety of applications. For example, fiber optic cables are used in the telecommunications industry to transmit voice, video,", "input_text": ", this input face was polished to increase its strength. 
While polishing the input face of the fiber resulted in a noticeable reduction in the observed failure rate, the subject invention disclosed below was developed to essentially eliminate these failures.\nWhile studying the failure mechanism described above, the inventor herein discovered that the exposure of the input face of the fiber to the cycling of the pressurized steam occurring during sterilization in the autoclave enhanced and accelerated the formation of cracks in the silica glass at the fiber surface. It had been previously reported that water molecules in the presence of cracks in glass can accelerate the breakdown of bonds. (See, \"The Fracturing of Glass,\" Michalske and Bunker, Scientific American, December, 1987, pages 122-129.) As can be appreciated, the high pressures generated in an autoclave can force steam molecules into any microscopic cracks present in the fiber, accelerating the break down of atomic bonds. It is also believed that when the autoclave is rapidly depressurized, turbulence is created forcing debris into the input face of the fiber thereby increasing the damage. These cracks and other imperfections lead to breakdown of the fiber during use.\nWhen the fiber optic probe is used in a surgical procedure, the input coupler is connected to the operating laser source such that the output of the laser source is focused onto the input face of the fiber. Typically, the diameter of the focal spot on the input face is about one-half the size of the diameter of the fiber to insure that all the light energy from the laser is coupled into the fiber. Due to this concentrated focusing of the input light, the power density on the input face is quite high. If microcracks are present in the input face, a portion of the light is scattered, causing heating and then melting of the fiber which sharply decreases the amount of light being coupled into the fiber. 
At this point, the"} +{"output_text": " can be used to execute an application program, the application program must be written in a computer language that is understandable to the computer. The computer language is the set of rules that the computer uses to interpret the instructions of the application program. The computer language is also referred to as the \u201cmachine language\u201d of the computer. The computer language is typically a set of instructions that the computer can understand. The computer language is typically a set of instructions that the computer can understand. The computer language is typically a", "input_text": " undergo repeated surgery to re-open the treated vessel. Stents with surface microstructures are just under research investigation and no clinical data from human are known today. U.S. Pat. No. 6,190,404 B1 describes for the first time the usage of microgrooves on stent struts for faster healing after stent deployment by favorably modifying endothelial cell (EC) migration. However, these patterns are not optimized for EC migration, proliferation or adhesion, differentiation of circulating precursor cells under flow conditions, whereas it is known that wall shear stress in combination with certain micropattern can have profound effects on cellular behaviour during adhesion, differentiation, migration and proliferation. Moreover other structural design elements like pits and holes in different geometrical arrangements (square, hexagonal, disordered, different heights) were not mentioned in this application but may also play a pivotal role in the development of a functional EC layer (ECL). A similar approach is described in US 2005/0209684 A1. 
\nThe intention of this invention with respect to the case wherein the implant is a stent is therefore to provide a specific geometrical arrangement of surface structures in the sub-micron to micrometer regime for faster re-endothelialization of stent struts after drug eluting stent deployment via Percutaneous Coronary Intervention (PCI). This soft healing approach will significantly reduce the problem of late stent thrombosis (LST) and restenosis. Computers are currently used to execute a wide variety of application programs. Such application programs include, for example, design and manufacturing programs, spread sheet programs, word processing programs, programs to facilitate access to data bases, programs to create graphics, and the like. As the number and kinds of application programs continue to proliferate, as computers become easier to use, and as people become increasingly accustomed to using computers, the types of application programs will continue to grow.\nWhile a computer"} +{"output_text": " the treatment of pain (WO 94/01124, WO 94/04491, WO 94/04492, WO 94/04493, WO 94/04494, WO 94/04495, WO 94/04500, WO 94/04501, WO 94/04515, WO 94/04534, WO 94/04535, WO 94/04536, WO 94/04537, WO 94/04538, WO 94/04539, WO 94/04540", "input_text": ") 3564-9 (1988); A. Perianin, et al., Biochem. Biophys. Res Commun. 161, 520 (1989)!, post-operative pain and nausea C. Bountra, et al., Eur. J. Pharmacol., 249, R3-R4 (1993), F. D. Tattersall, et al., Neuropharmacology, 33, 259-260 (1994)!, vasodilation, bronchospasm, reflex or neuronal control of the viscera Mantyh et al., PNAS, 85, 3235-9 (1988)! and, possibly by arresting or slowing.beta.-amyloid-mediated neurodegenerative changes Yankner et al., Science, 250, 279-82 (1990)! in senile dementia of the Alzheimer type, Alzheimer's disease and Downs Syndrome. Substance P may also play a role in demyelinating diseases such as multiple sclerosis and amyotrophic lateral sclerosis J. Luber-Narod, et. al., poster C.I.N.P. 
XVIIIth Congress, 28th Jun.-2nd Jul., 1992!, and in disorders of bladder function such as bladder detrusor hyper-reflexia Lancet, 16th May 1992, 1239!. Antagonists selective for the neurokinin-1 (NK-1) and/or the neurokinin-2 (NK-2) receptor may be useful in the treatment of asthmatic disease (Frossard et al., Life Sci., 49, 1941-1953 (1991); Advenier, et al., Biochem. Biophys. Res. Comm., 184(3), 1418-1424 (1992); P. Barnes, et al., Trends Pharmacol. Sci., 11, 185-189 (1993)). Tachykinin antagonists may also be useful in"} +{"output_text": " adhered to the tool surface. In the case of a metallic tool, however, the cleaning work is difficult since the tool surface is damaged by the cleaning grindstone. In the case of a diamond or cBN sintered body, moreover, the tool surface is damaged by the cleaning grindstone and the tool is broken, thus resulting in a problem that the tool cannot be used for a long time.\nIn order to solve these problems, it has been considered effective to use a hard material consisting predominantly", "input_text": " the bonding material is gradually subject to damage by the repeated heating and pressing, thus resulting in a problem that it is difficult to maintain a uniformly bonded state for a long period of time. In such a metallic tool, however, cleaning work to periodically remove the solder or oxide adhered to the pressing surface is required, during which damage by a cleaning grindstone changes the flatness of the tool pressing surface and unfavorably affects the bonded state. This is a large problem.\nIn order to solve these problems, it has been considered effective to use a hard material consisting predominantly of diamond or cBN (cubic boron nitride) excellent in heat resistance and wear resistance as a tool end material. However, single crystal diamond has poor practical utility because of being limited in size. 
In the case of a diamond or cBN sintered body, moreover, there is a problem that it is difficult to work it into an end shape as shown in FIG. 10 and to maintain the flatness of the end surface at a high temperature within a precision range required for a long time since the property of the sintered body is affected by a metallic or non-metallic binder.\nIn any mounting method, the bonding tool must have an excellent heat resistance as its property, since it is constantly or intermittently allowed to be present under high temperature state. That is, it is required of the tool to directly press LSI without breakage of LSI and to maintain the surface roughness and flatness of the tool end surface under good state without thermal damage for a long time. However, the use of a metallic tool of an Invar alloy or Mo, etc. having hitherto been used up to the present time results in a problem that the property is gradually deteriorated.\nFurthermore, as referred to above, the tool surface should periodically be cleaned since the heated and sublimated solder or resin is solidified and"} +{"output_text": ", 1728-1732), ferrocenes grafted onto a polymer containing a redox group (Foulds, N. C. and Lowe, C. R. (1988) Anal. Chem. 60, 1728-1732), ferrocenes grafted onto a polymer containing a redox group and a polymer containing a redox group (Foulds, N. C. and Lowe, C. R. (1988) Anal. Chem. 60, 1728-1732),", "input_text": " thus plays the part of mediator since it permits the transfer of electrons. This transfer of electrons, which is proportional to the amount of glucose present in the solution to be tested, is then measured by the ammeter and the amount of glucose present in the solution is displayed by the display means of the measuring apparatus.\nAdditional research has shown that amperometric devices using non-physiological, organic, inorganic or organometallic mediators can supplant devices using oxygen as the mediator. Indeed, as shown in FIG. 
2, devices using oxygen as the mediator cannot be used in solutions where the stoichiometric oxygen content is less than the concentration of the component to be measured. Otherwise, in this case, while the total amount of the component to be measured is able to react with the oxidized enzyme to form the reduced enzyme, only part of the total amount of the reduced enzyme can react with the oxygen present, in proportion to this amount of oxygen. The rest of the reduced enzyme is unable to react and the quantity of electrons transmitted to the conductor C is less than it should be.\nConsequently, when this type of device is used, one is either restricted by the respective concentrations of the oxygen and the component to be measured, or compelled to use a membrane to limit the diffusion of said component. This explains why attempts have been made to produce amperometric devices using a specific mediator to replace oxygen.\nVery many mediators have been proposed in the literature, such as monomeric ferrocenes (Cass, A. E. G. et al (1984), Anal. Chem. 56, 667-671; Degani, Y. and Heller, A. (1987), J. Phys. Chem. 91, 1285-1289), ferrocenes grafted onto a polymer (Foulds, N. C. and Lowe, C. R. (1988) Anal. Chem. 60"} +{"output_text": "Organic EL elements are generally classified into two types: a low-molecular-weight organic EL element and a high-molecular-weight organic EL element. The low-molecular-weight organic EL element is formed by laminating a hole transport layer, a light-emitting layer, and an electron transport layer on a transparent electrode, and the high-molecular-weight organic EL element is formed by laminating a hole transport layer, a light-emitting layer, and an electron transport layer on a transparent", "input_text": " yet extend down to where the sweep is being run.\nKnown cultivators rely on the weight of the cultivator unit (alone or with the weight of the tool bar added) to provide the force necessary to drive the sweeps into the ground. 
This is a problem when the ground consists of hard earth or the density of the soil is uneven (having randomly located hard and soft spots). The sweeps will often pop out of the ground when they hit a hard spot. Or they will drive deeply into the softer ground, going under the weed and grass plants, thereby missing them entirely.\nWhen cultivating fields planted with crops like corn, soybeans, or cotton, the cultivator is set to throw dirt onto the crop row itself at the base of the crop. This is done to cover up the weeds and grasses that are growing in the crop row and which therefore cannot be cut without also cutting the crop plants. However, when working in fields planted with peanuts, for example, the cultivator must be set to keep from throwing dirt onto the crop row.\nWhen a known cultivator is run in a field treated with herbicide in a broadcast pattern, the operation of the cultivator punctures the herbicide blanket, resulting in a strip of untreated dirt. Tears in the herbicide blanket are not repaired by known cultivators, particularly where the lumps or clods of soil are not reduced to smaller size.\nAs will be described in detail below, the present invention overcomes the deficiencies of and problems associated with the conventional technology noted above. Over recent years, organic electroluminescent elements (hereinafter, also referred to simply as organic EL elements) employing organic materials have been regarded as promising in use as thin, inexpensive large-area full-color display elements of a solid light-emitting type and light source arrays, and therefore active research and development are being conducted.\n"} +{"output_text": "80x9d of the crown. The crown is therefore often compromised by the post and core system. The post and core system is generally bonded to the tooth with a cement. The cement is generally a two-part system, with a first part being a self-curing resin and a second part being a catalyst. 
The resin is generally a light-curable resin, and the catalyst is generally a peroxide. The resin and catalyst are mixed together and placed in a syringe. The syringe is", "input_text": " that of a natural tooth.\nRigid dental post and core systems are widely utilized to restore endodontically-treated teeth. Post and core restorations are routinely used to create an adequate foundation for the final restorative step, which may be a crown, inlay, or a fixed partial denture abutment. Generally, a post is provided for retention and lateral stability of the restoration. The core provides support for the crown. Two general types of post and core systems are known in the art: \u201cactive\u201d or screw-in type systems and \u201cpassive\u201d type systems. Active post and core systems mechanically engage the walls of the root canal and tooth dentin. Passive post and core systems are bonded in endodontically treated teeth utilizing cements and the like.\nTwo major problems are encountered when restoring an endodontically-treated tooth. Firstly, the tooth is more susceptible to fracture, and secondly, there is generally less coronal structure with which to work. The greater susceptibility of a tooth to fracture after endodontia may result from the tooth being more brittle. However, studies of the changing mechanical properties of pulpless teeth do not generally support this theory equating dryness with reduced mechanical strength. It appears that the greater susceptibility for fracture in an endodontically-treated tooth results from mechanical weakening of the tooth during root canal therapy and refinement of the root canal. Improvements in restoration techniques that reduce mechanical weakening are therefore desirable.\nAn endodontically-treated tooth is generally severely compromised either due to trauma or neglect. 
Thus, traumatic fractures, removal of old restorations and carious tissue, and preparation of root canal access may not leave enough tooth to maintain the xe2x80x9cdome effectxe2x"} +{"output_text": " effect of the losses on the output pulse.\nThe present invention provides a method and apparatus for reducing the sensitivity of a laser system to losses in the laser system. The invention is particularly useful in laser systems that use a partially reflective output coupler, such as a partially reflective output coupler that is partially transmissive.\nIn accordance with one aspect of the invention, a method of reducing the sensitivity of a laser system to losses in the laser system includes providing a laser system having a laser cavity,", "input_text": " Application Ser. No. JP11-025890, filed on Feb. 3, 1999, published on Aug. 11, 2000, Publication No. 2000223408, entitled SEMICONDUCTOR MANUFACTURING DEVICE, AND MANUFACTURING OF SEMICONDUCTOR DEVICE, disclosed a solid state seed laser and an injection locked power amplifier with a phase delay homogenizer, e.g., a grism or grism-like optic between the master oscillator and amplifier. U.S. Published application 20060171439, published on Aug. 3, 2006, entitled MASTER OSCILLATOR-POWER AMPLIFIER EXCIMER LASER SYSTEM, a divisional of an earlier published application 20040202220, discloses as master oscillator/power amplifier laser system with an optical delay path intermediate the master oscillator and power amplifier which creates extended pulses from the input pulses with overlapping daughter pulses.\nPartlo et al, Diffuser speckle model: application to multiple moving diffusers, discusses aspects of speckle reduction. U.S. Pat. No. 5,233,460, entitled METHOD AND MEANS FOR REDUCING SPECKLE IN COHERENT LASER PULSES, issued to Partlo et al. on Aug. 
3, 1993 discusses misaligned optical delay paths for coherence busting on the output of gas discharge laser systems such as excimer laser systems.\nThe power efficiency of a regenerative amplifier, e.g., using a switching element, can be severely reduced by the effect of intracavity losses (particularly in the electro-optic switch). Also, the reflectivity of a partially reflective output coupler can affect both intracavity losses and the duration of the output pulse, etc. The sensitivity to such losses can be particularly high in cases with low gain, because this increases the"} +{"output_text": "NH\u2014C(\u2550O)\u2014R13; \u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014R13; \u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014R13; \u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014NH\u2014C(\u2550O)\u2014R13;", "input_text": "-membered, eight-membered, or nine-membered cycloaliphatic radical optionally containing at least one heteroatom as ring member, which can be condensed with a saturated or unsaturated, unsubstituted or at least monosubstituted monocyclic or polycyclic ring system and/or can be bonded via a linear or branched, unsubstituted or at least monosubstituted C1-6 alkylene group or two to six-membered heteroalkylene group; or for an unsubstituted or at least monosubstituted five-membered to fourteen-membered aryl radical or heteroaryl radical, which can be condensed with a saturated or unsaturated, unsubstituted or at least monosubstituted monocyclic or polycyclic ring system and/or can be bonded via a linear or branched, unsubstituted or at least monosubstituted C1-6 alkylene group or two to six-membered heteroalkylene group; R35, R36, and R37 each independently stand for H; F; Cl; Br; I; \u2014SFS; \u2014NO2; \u2014CF3; and \u2014CN; \u2014NH2; \u2014OH; \u2014SH; \u2014C(\u2550O)\u2014NH2; \u2014S(\u2550O)2\u2014NH2; 
\u2014C(\u2550O)\u2014NH\u2014OH; \u2014C(\u2550O)\u2014OH; \u2014C(\u2550O)\u2014H; and \u2014S(\u2550O)2\u2014OH; \u2014NHR13; \u2014NR14R15; \u2014NH\u2014C(\u2550O)\u2014R13; \u2014OR16, \u2014SR17; \u2014C(\u2550O)\u2014NHR18; \u2014C(\u2550O)\u2014NR19R20; \u2014S(\u2550O)2\u2014NHR21; and \u2014S(\u2550O)2\u2014NR22R23; \u2014C(\u2550O)\u2014"} +{"output_text": "\nIn addition, the printing plate should be able to withstand the high temperatures experienced during the printing process. Further, the printing plate should be able to withstand the high pressures experienced during the printing process.\nIn addition, the printing plate should be able to withstand the high shear forces experienced during the printing process. Further, the printing plate should be able to withstand the high temperatures and pressures experienced during the printing process.\nIn addition, the printing plate should be able to withstand the high shear forces and", "input_text": " There is still further a need for a system that provides users with updates associated with applications directly without the need to access the mobile operating system's digital distribution platform. In flexographic printing, also known as relief printing, ink is transferred from a pool of ink to a substrate by way of a printing plate. The surface of the plate is shaped so that the image to be printed appears in relief, in the same way that rubber stamps are cut so as to have the printed image appear in relief on the surface of the rubber. Typically, the plate is mounted on a cylinder, and the cylinder rotates at high speed such that the raised surface of the printing plate contacts a pool of ink, is slightly wetted by the ink, then exits the ink pool and contacts a substrate web, thereby transferring ink from the raised surface of the plate to the substrate to form a printed substrate.\nFlexographic printing competes with other forms of printing, e.g., lithography, gravure and letterpress printing. 
Those involved in the flexographic printing industry are constantly striving to improve the flexographic printing process in order to more effectively compete with other printing methods. One area which has received much attention from researchers is the development of improved plates for flexographic printing.\nThe demands placed on flexographic printing plates are great. For instance, a flexographic printing plate must have sufficient flexibility (a mechanical property) to wrap around a printing cylinder, yet be strong enough to withstand the rigors experienced during typical printing processes. Further, the printing plate should possess a low hardness or softness to facilitate ink transfer during printing.\nIt is also required that the printing plate have a relief image that has a chemical resistance against the aqueous-based ink or alcohol-based ink which is usually used in flexographic printing. It is further desired that the physical and printing properties of the printing plate are stable and do not change during printing or storage."} +{"output_text": " then releases the adsorbed fuel into the intake passage 77 of the engine 76 through the purge line 75. The fuel vapor is then combusted in the engine 76.\nThe fuel vapor treating device is tested by filling the fuel tank 71 with fuel and then operating the engine 76. The fuel vapor treating device is then tested by operating the engine 76 while the canister 73 is filled with fuel vapor. The fuel vapor treating device is then tested by operating the engine 76 while the canister 73 is filled with", "input_text": " a knowledge of the rate constants for leghemoglobin oxygenation and deoxygenation, an estimate of the free O.sub.2 concentration in the infected cells can be calculated. 
Many non-leguminous N.sub.2 fixing plants are also known to contain hemoglobins, and it should be possible to use the methodology described herein to measure the oxygen concentration in these nodules.\nHemoglobin oxygen saturation has been studied in mammalian systems for many years and there are several instruments, known as oximeters, available to measure non-invasively the proportion of hemoglobin oxygen saturation in blood. One such system is described in some detail in IEEE Trans. Biomed. Eng. 35: 185-197 (1988). However, these oximeters, which will be discussed in more detail hereinafter, are sensitive to ambient light, not designed for use with small nodules and therefore they are not suitable for agricultural field use. 1. FIELD OF THE INVENTION\nThe present invention relates generally to an apparatus for collecting and treating vaporized fuel in a fuel tank without releasing the fuel vapor into the atmosphere. More particularly, the present invention pertains to a testing apparatus for testing a fuel vapor treating device.\n2. DESCRIPTION OF THE RELATED ART\nA fuel vapor treating device, typically mounted on a vehicle, collects and treats vaporized fuel in a fuel tank without releasing the fuel vapor into the atmosphere. As shown in FIG. 14, a typical apparatus has a canister 73 that draws in and collects fuel vaporized in a fuel tank 71 through a vapor line 72. The canister 73 is filled with an adsorbent 74 comprised of activated carbon or the like. A purge line 75, extending from the canister 73, is connected to an intake passage 77 of an engine 76. The canister 73 first adsorbs the vaporized fuel drawn in through the vapor line 72. 
The canister 73"} +{"output_text": "3xe2x80x2, and exe2x80x2 are each 0 or 1, with the proviso that at least one of a1xe2x80x2 to exe2x80x2 is 1, and that when k is equal to 1, at least one of a1xe2x80x2 to d3xe2x80x2 is 1, and when k is equal to 0, at least one of a1xe2x", "input_text": " monovalent hydrocarbon group of 3 to 15 carbon atoms containing a xe2x80x94CO2xe2x80x94 partial structure. At least one of R010 to R013 is a monovalent hydrocarbon group of 2 to 15 carbon atoms containing a xe2x80x94CO2xe2x80x94 partial structure, while the remaining R\"\"s are independently hydrogen or straight, branched or cyclic alkyl groups of 1 to 15 carbon atoms. R010 to R013, taken together, may form a ring, and in that event, at least one of R010 to R013 is a divalent hydrocarbon group of 1 to 15 carbon atoms containing a xe2x80x94CO2xe2x80x94 partial structure, while the remaining R\"\"s are independently single bonds or straight, branched or cyclic alkylene groups of 1 to 15 carbon atoms. R014 is a polycyclic hydrocarbon group having 7 to 15 carbon atoms or an alkyl group containing a polycyclic hydrocarbon group. R015 is an acid labile group. R016 is hydrogen or methyl. R017 is a straight, branched or cyclic alkyl group of 1 to 8 carbon atoms. X is CH2 or an oxygen atom. Y is an oxygen atom or NR018 wherein R018 is a straight, branched or cyclic alkyl group of 1 to 6 carbon atoms. Letter k is equal to 0 or 1; a1xe2x80x2, a2xe2x80x2, a3xe2x80x2, b1xe2x80x2, b2xe2x80x2, b3xe2x80x2, c1xe2x80x2, c2xe2x80x2, c3xe2x80x2, d1xe2x80x2, d2xe2x80x2, d"} +{"output_text": " excavator body vibrates as a whole.\nIn the pipe laying apparatus of the combined rotation and vibration excavation type, the excavator body is provided with a rotary excavating tool and a vibrator, and the excavator body is rotated by the rotary excavating tool while the vibrator is vibrated by a vibrator drive unit. 
The excavator body is provided with a pipe laying head which is connected to the rotary excavating tool and a pipe laying arm which is connected to the vibr", "input_text": " that it is difficult to convey the excavated soil toward the starting pit for discharging thereof. Meanwhile, the pipe laying apparatus of the vibration excavation type offers the advantage that even if the earth contains solid materials such as gravel, they can be embedded in the earth by the vibration while allowing only the viscosity imparting liquid containing soil of high viscosity to be discharged to the starting pit. However, this type encounters the problem that when the earth layer is clay, it is difficult to improve operation efficiency and achieve a high excavation speed unless the amplitude of vibration of the excavator is increased, while if the amplitude is increased, the vibration of the ground surface increases in magnitude.\nProposals have been made, as described in Japanese Patent Application Laid-Open No. 44194/83, for example, to use a pipe laying apparatus which possesses the merits of both the rotation excavation type and the vibration excavation type by mounting a vibrator in the excavator body having a rotary excavating tool for allowing rotation and vibration excavations to be effected, so that the apparatus can cope with a wide range of earth layers including sand, clay, gravel, etc.\nMeanwhile, the pipe laying apparatus of the above-mentioned vibration excavation type also suffers the disadvantage that since the excavator body vibrates as a whole, the vibration of the excavator is transmitted to the pipes to be laid which are rigidly connected to the excavator body. 
Thus, the pipes to be laid are caused to vibrate simultaneously as the excavator body vibrates, and therefore it becomes necessary to increase the size of the excavator to increase the magnitude of the vibration produced.\nThe pipe laying apparatus of the combined rotation and vibration excavation type as disclosed in Japanese Patent Application Laid-Open No. 44194/83, noted hereinabove, has the same disadvantage as the pipe laying apparatus of the vibration excavation type since the"} +{"output_text": " it is desirable to conceal them as much as possible.\nThe invention aims to propose a motor vehicle windshield wiper that is easy to mount and remove, even by a person who is not experienced, and that is also easy to conceal.\nTo this end, the invention concerns more specifically a motor vehicle windshield wiper, of the type in which a wiper blade is articulated at the longitudinal front end of a wiper arm, around a transversal horizontal axis, via a connector that", "input_text": " invention concerns more specifically a motor vehicle windshield wiper, of the type in which a wiper blade is articulated at the longitudinal front end of a wiper arm, around a transversal horizontal axis, via a connector that is articulated on the blade.\nAccording to a known conception of the mounting articulated on a blade at the end of an arm, the connector is fit together elastically according to the radial direction on an articulation rod of the blade and the front end of the arm is longitudinally curved in order to form a hook in which the connector, once mounted on the blade, must be longitudinally inserted from the rear to the front.\nFor reasons of rigidity and compact size of articulation, but also for aesthetic questions, the connector is generally received between two lateral flanks of the blade that are linked via the articulation rod. 
In that way, the front end of the arm must also be received between the flanks of the arm, the front of the connector in relation to the blade in order to allow the insertion of the connector in the hook of the arm.\nSuch a mounting, if it presents reliability guarantees, reveals itself to be particularly delicate to perform and imposes that the blade presents, in the front of the articulation rod, an opening placed in an upper back of the blade in order to allow the insertion of the front end of the arm. Such an opening remains at least partially visible after the mounting of the blade on the arm.\nA conception has already been proposed in which the articulation means of the blade on the arm allows an easy mounting and removal of the blade, even by a person who is not experienced. In effect, the vehicle\"\"s owner is encouraged to regularly change their wiper blades and must be able to proceed through this operation in the simplest way possible.\nMoreover, the wipers being visible parts on the exterior of the vehicle,"} +{"output_text": " the waist, the wrist, the ankle, the ear, the forehead, the neck, the chest, the arm, the leg, the foot, the toe, the head, the chin, the jaw, the neck, the back, the shoulder, the elbow, the wrist, the hand, the finger, the toe, the foot, the ankle, the heel, the knee, the elbow, the shoulder, the hip, the knee, the ankle, the wrist, the elbow, the", "input_text": " other real-time inputs; and an electrical stimulation profile and a footprint conducive to long term wearability. In addition, prior art therapies which have some degree of flexibility include an electrode which must be tethered via cables to a control or power box. 
Prior art therapies which are wireless are typically bulky, inflexible, and not amenable to being worn for long periods of time.\nBecause successful weight loss is, in the end, a matter of achieving a high degree of compliance with a dietary regimen, it is absolutely critical for a successful device to go beyond mere appetite suppression and combine wearability, physical comfort, ease of use, and integration of numerous data sources to provide a holistic and real-time view into a person's dietary compliance, in addition to effectively modulating the individual's appetite, hunger, satiety level, satiation level, or fullness.\nTherefore, there is a need for a low profile, long lasting electrical neuro-stimulation device which is programmable, and is effective to cause appetite or hunger control, modulation or suppression while minimizing any accompanying nausea, dyspepsia and habituation. There is also a need for a device that can effectively integrate appetite management data with conventional weight management information, such as caloric expenditure and consumption.\nThere is a need for an electrical neuro-stimulation device which is wearable and can be controlled, programmed, and self-administered by the patient, thereby enabling greater patient independence. There is also a need for an electrical neuro-stimulation device which includes real-time or near real-time feedback from patient parameters including, but not limited to, exercise, diet, hunger, appetite, well-being and which will be able to obtain real-time or near real-time feedback from other wearable devices, for example, a device, with physiological sensors, configured to be worn on the human body, such as around"} +{"output_text": ".\nThe sample pre-treatment is performed by a method of heating a sample to a high temperature of about 1,000\u00b0 C. or more in a furnace, and then extracting carbon from the sample by a method of heating the sample to a high temperature of about 1,000\u00b0 C. 
or more in a furnace.\nHowever, in the case of the sample pre-treatment using a furnace, there is a problem that the sample is damaged by heat. Moreover, there is also a", "input_text": " acquired (the above-mentioned a-cut KTP crystal is about 62\u00b0 C.). Therefore, in case of obtaining SHG light using the conventional device, there was a problem that the output of SHG light did not become large efficiently. Moreover, there was also a problem that if the temperature of a crystal was not stabilized within about 1/100\u00b0 C. or less, stable SHG light output was not obtained.\n[Non-patenting reference 1] Page 1192 of the 50th meeting drafts, The Japan Society of Applied Physics and Related Societies A radiocarbon dating method which has been used to measure the age of remains having an archeological value means a radiocarbon dating method using a principle of collapsing in-vivo radiocarbon after the death of an organism at a constant ratio.\nThree kinds of carbon isotopes such as 12C, 13C, and 14C are mainly present in nature. Here, 12C occupies 98.89% of nature, 13C occupies 1.11% of nature, and a trace of 14C is present in nature. Meanwhile, even though carbon is absorbed into a body of an organism by photosynthesis or breathing, the ratio of 12C, 13C, and 14C keeps unchanged.\nHowever, after the organism is dead, 14C which is instable radiocarbon collapses at a constant rate and thus is changed to 14N. In this case, the organism suffers from a half-life in which an amount of 14 C is reduced half. The age of the organism may be estimated by the fact that the half-life is about 5,730 years.\nTo measure the age of a sample such as remains by an accelerator mass spectrometry which is one of the radiocarbon dating methods, there is a need to first extract carbon from the sample. 
This is referred to as a sample pre-treatment"} +{"output_text": " are dispersed in a solvent.\nThe dispersion is preferably a solution or suspension in which tin-doped indium oxide and/or antimony-doped tin oxide particles are dispersed in a solvent.\nThe solvent is preferably a polar solvent.\nThe polar solvent is preferably a polar solvent having a solubility parameter of not less than 7.0.\nThe polar solvent is preferably a polar solvent having a solubility parameter of not less than 7.0 and not more than 9.0.\nThe polar", "input_text": " glass and the interlayer film, thereby excellent penetration resistance is obtained.\nPreferable embodiment is an interlayer film wherein the number of tin-doped indium oxide and/or antimony-doped tin oxide with a particle diameter of not less than 100 xcexcm is one or less per 1 xcexcm2 of the interlayer film. That is, the embodiment in which the above-mentioned particles with the particle diameter of not less than 100 xcexcm may not be observed in the interlayer film, or even it can be observed, it is only the particle that is set at the center of 1 square micrometer flame and no other particle with the particle diameter of not less than 100 xcexcm can be seen within the flame, in the case of taking photographs and observing interlayer film by using transmission electron microscope.\nThe observation can be carried out by using transmission electron microscope, xe2x80x9cH-7100FA type transmission electron microscopexe2x80x9d produced by Hitachi Co., Ltd., and the photographs are taken at 100 kv acceleration voltage.\nAlso, preferred embodiment of the interlayer film of the present invention is an interlayer film for laminated glass, in which tin-doped indium oxide and/or antimony-doped tin oxide particles in dispersion has the average particle diameter of from 10 to 80 nm at room temperature, and still, 10 to 80 nm even after heating the dispersion up to 200xc2x0 C.\nThe interlayer film for laminated glass obtained by molding 
interlayer film out of said dispersion has the low haze and excellent transparency, wherein tin-doped indium oxide and/or antimony-doped tin oxide particles are dispersed in said film.\nSaid dispersion, mentioned later in detail, is a solution or suspension in which tin-doped indium oxide and/or antimony-doped tin oxide particles"} +{"output_text": "\nIn the case of the hidden terminal problem, the sending node \u201ca\u201d cannot detect the start of packet sending from the sending node \u201cb\u201d because of the electric wave propagation delay between sending nodes, and starts sending out the packet almost simultaneously with the sending node \u201cb\u201d.\nIn the case of the exposed terminal problem, the sending node \u201cb\u201d cannot detect the start of packet sending from the sending node \u201ca\u201d because of the electric wave propagation delay between sending nodes, and starts sending", "input_text": " the distance between sending nodes. This is because, in such a case, the carrier sense cannot be performed properly, and the sending node \u201cb\u201d sends out a packet in spite of the fact that the sending node \u201ca\u201d is sending another packet.\nNote that there is a propagation delay time problem as another cause of packet collisions. 
The propagation delay time problem means that, in spite of the fact that the carrier sense was performed properly, the sending node \u201cb\u201d cannot detect the start of packet sending from the sending node \u201ca\u201d because of the electric wave propagation delay between sending nodes, and starts sending out the packet almost simultaneously with the sending node \u201ca\u201d.\nMeanwhile, the exposed terminal problem is a problem that data is sent and received because of an unnecessary carrier sense, resulting in the reduction in system throughput.\nFor example, when the sending node \u201ca\u201d starts sending out a packet to a receiving node \u201ca\u201d, the sending node \u201cb\u201d detects the packet of the sending node \u201ca\u201d with a carrier sense, and determines that the situation of the medium is Busy. At this time, when there is a packet which the sending node \u201cb\u201d is about to send to the receiving node \u201cb\u201d, the sending node \u201cb\u201d is controlled to not send the packet but keep it.\nHowever, in the case described above, a situation can also be considered, where the receiving node \u201ca\u201d and the receiving node \u201cb\u201d are distanced enough such that packet collisions do not occur even if the sending node \u201cb\u201d sends a packet to the receiving node \u201cb\u201d. 
In such a situation, the carrier sense is unnecessary, reducing the system throughput.\nThe hidden terminal problem and the exposed terminal problem occur to predict packet collisions depending on the presence or absence of the packet sending in the sending node (verification of the situation of the medium with a carrier sense)."} +{"output_text": " of the photoelectric conversion section is inevitably limited to a manufacturing technique of a special type.\nIn addition, in the photoelectric conversion type information processing device using the C-Si wafer, the photoelectric conversion section is formed by a process of forming a p-n junction in the C-Si substrate. In this case, the p-n junction is formed by diffusing an impurity into the C-Si substrate. In this case, however, the impurity is diffused into the C-", "input_text": " at best in consideration of uniformity in the entire region of such wafer. On account of this, the light receiving surface of the photoelectric conversion element cannot be larger than the size of the C-Si substrate in the photoelectric conversion type information processing device using such C-Si wafer and including MOS type or CTD as its constituent element.\nAccordingly, when the information processing device having the photoelectric conversion section, the light receiving surface of which has such limited area, is used as the input device for the digital copier, for example, it is inevitably necessary that an optical system having a large image reduction ratio be interposed between an image original to be reproduced and the light receiving surface so that an optical image of the image original may be formed on the light receiving surface through the optical system. 
In this case, however, there exist technical restrictions against increase in image resolution to be described in the following.\nWhen an image original in A4 size is to be reproduced with the photoelectric conversion section having its image resolution of, for example, 10 lines/mm and a length of the light receiving surface in the longitudinal direction of 3 cm, the optical image of the image original to be focussed on the light receiving surface is reduced to about 1/6.9 with the consequence that the substantial image resolution of the photoelectric conversion section to the A4 size image original reduces to about 1.5 lines/mm. Thus, the substantial image resolution of the photoelectric conversion section lowers at a rate of (size of the light receiving surface)/(size of the image original) according as the size of the image original to be reproduced becomes larger.\nIn order, therefore, to solve this problem in this type of information processing device, there is required a manufacturing technique for increasing the image resolution of the photoelectric conversion section. However, for such high resolution to be obtained with such limited small area, the manufacture"} +{"output_text": " wide area network.\nFor each wireless communication device to participate in wireless communications, it includes a built-in radio transceiver (i.e., receiver and transmitter) or is coupled to an associated radio transceiver (e.g., a station for in-home and/or in-building wireless communication networks, RF modem, etc.). As is known, the receiver is coupled to an antenna and includes a low noise amplifier, one or more intermediate frequency stages, a filtering stage, and a", "input_text": "Communication systems are known to support wireless and wire lined communications between wireless and/or wire lined communication devices. Such communication systems range from national and/or international cellular telephone systems to the Internet to point-to-point in-home wireless networks. 
Each type of communication system is constructed, and hence operates, in accordance with one or more communication standards. For instance, wireless communication systems may operate in accordance with one or more standards including, but not limited to, IEEE 802.11, Bluetooth, advanced mobile phone services (AMPS), digital AMPS, global system for mobile communications (GSM), code division multiple access (CDMA), local multi-point distribution systems (LMDS), multi-channel-multi-point distribution systems (MMDS), and/or variations thereof.\nDepending on the type of wireless communication system, a wireless communication device, such as a cellular telephone, two-way radio, personal digital assistant (PDA), personal computer (PC), laptop computer, home entertainment equipment, et cetera communicates directly or indirectly with other wireless communication devices. For direct communications (also known as point-to-point communications), the participating wireless communication devices tune their receivers and transmitters to the same channel or channels (e.g., one of the plurality of radio frequency (RF) carriers of the wireless communication system) and communicate over that channel(s). For indirect wireless communications, each wireless communication device communicates directly with an associated base station (e.g., for cellular services) and/or an associated access point (e.g., for an in-home or in-building wireless network) via an assigned channel. 
To complete a communication connection between the wireless communication devices, the associated base stations and/or associated access points communicate with each other directly, via a system controller, via the public switch telephone network, via the Internet, and/or via some other"} +{"output_text": "AxO2+y compounds having xi and yi values within the range of 0.98xe2x89xa6xixe2x89xa61.02 and 0xe2x89xa6yixe2x89xa60.02.\nThe lithium cobalt oxides of the invention preferably have a position within the principal component space defined by the following relationship:\naxi+byixe2x89xa6c \nwherein xi={right arrow over (", "input_text": "60.05, xe2x88x920.02xe2x89xa6yxe2x89xa60.02 and A is one or more dopants. Preferably, 0.98xe2x89xa6wxe2x89xa61.02 and 0xe2x89xa6xxe2x89xa60.02.\nThe lithium cobalt oxides of the invention preferably have a position within the principal component space defined by the following relationship:\naxi+byixe2x89xa6c \nwherein xi={right arrow over (S)}ixe2x97xaf{right arrow over (P)}c1; yi={right arrow over (S)}ixe2x97xaf{right arrow over (P)}c2; the vector {right arrow over (S)}i is the x-ray spectrum for the LiwCo1xe2x88x92xAxO2+y compound; the vectors {right arrow over (P)}c1 and {right arrow over (P)}c2 are determined by measuring the x-ray powder diffraction values {right arrow over (S)}i between 15xc2x0 and 120xc2x0 using a 0.020 step size and CuKxcex1 rays for a large sample set of lithium cobalt oxides and using the regression of {right arrow over (S)}i of the sample set against the capacity fade after 50 cycles of a lithium coin cell that includes a lithium negative electrode and the lithium cobalt oxide as the positive electrode material and that is cycled between 3.0 and 4.3V at a constant current of C/3 during both charge and discharge cycles; and the values a, b and c are determined by using only the xi and yi values for LiwCo1xe2x88x92x"} +{"output_text": " junctions are monitored. 
The currents are controlled based on the amounts.\nGenerally, in another aspect, the invention features a method for bonding a number of first materials to a second material at different regions of the second material. The method includes placing each of the first materials in contact with the second material to form junctions between the first and second materials. A current is applied through the first and second materials to transfer charge to the junctions. The amount of charge transferred to each of the junctions is monitored. The", "input_text": " to the materials.\nGenerally, in another aspect, the invention features a system for bonding a number of first materials to a second material near different regions of the second material. The system includes an energy source and electrodes that are configured to establish charge flowpaths. The system also has controllers. Each different controller is associated with a different one of the flowpaths and is configured to cause charge to flow from the energy source through the associated flowpath until a predetermined amount of the charge flows through the associated flowpath.\nGenerally, in another aspect, the invention features a system for bonding glass spacer rods to a face plate of a flat panel display. The system includes an energy source, electrodes and controllers. The electrodes are configured to establish charge flowpaths. Each different flowpath is associated with a junction located between a different one of the glass spacer rods and the face plate. Each different controller is associated with a different one of the flowpaths and is configured to allow charge to flow from the energy source through the associated flowpath until a predetermined amount of the charge flows to the junction associated with the flowpath.\nGenerally, in another aspect, the invention features a method for anodically bonding two materials together. 
The method includes placing the two materials in contact with each other to form a junction between the materials. A current is applied through the materials to transfer charge to the junction. This current is monitored to determine the amount of the charge being transferred to the junction. The current is controlled based on the amount.\nGenerally, in another aspect, the invention features a method for bonding a number of first materials to a second material at different regions of the second material. The method includes placing each of the first materials in contact with the second material to form junctions between the first and second materials. Currents are applied through the first and second materials to transfer charge to the junctions. The amounts of charge transferred to each of the"} +{"output_text": " Union Carbide), Poly-G 76-635-1 (a polyether triol of molecular weight 265-1, available from Union Carbide), and Poly-G 76-635-2 (a polyether triol of molecular weight 265-2, available from Union Carbide).\nThe acid anhydride is a compound having the formula Rxe2x80x94COxe2x80x94Oxe2x80x94COxe2x80x94R,", "input_text": " undesirable by-products.\nAs defined herein, the term xe2x80x9cpolyolxe2x80x9d refers to compounds having between two and four free hydroxyl (xe2x80x94OH) groups per molecule, and preferably three hydroxyl groups. As defined herein, the phrase xe2x80x9clow molecular weight polyolxe2x80x9d refers to those polyols having a molecular weight less than 8,000, more preferably less than 2,000, and most preferably less than 500. The phrase xe2x80x9ccarboxyl-containing monomerxe2x80x9d refers to a polyol having a carboxyl group added to one of the hydroxyl groups of the polyol.\nAs indicated above, in one aspect, the present invention is directed to a carboxyl-containing monomer for use in preparing a polyurethane polymer. 
The carboxyl-containing monomer is the reaction product of a low molecular weight polyol compound and an acid anhydride, and the resulting carboxyl-containing monomer has a viscosity in the range of 3,000-100,000 centipoise (cps) and has oligomer content in the range of 2-30 mg KOH/g. Each of these components are discussed in more detail below.\nExamples of polyols that are useful in the present invention include low molecular weight polyols having from two to four hydroxyl groups. Preferably, the polyol contains three free hydroxyl groups (hereinafter termed xe2x80x9ctriolxe2x80x9d). Triols suitable for use in the present invention are generally based on the structure of glycerol, trimethylolpropane, trimethylolethane, and the like. Preferred triols include Poly-G 76-635 (a polyether triol of molecular weight 265, available from"} +{"output_text": ", 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71", "input_text": " may not provide any protection despite their high numbers in the infected host.\nIn the United States, approximately 40,000 newborns are congenitally infected with human cytomegalovirus (HCMV) annually. For many years, it has been recognized that human cytomegalovirus (HCMV) is efficiently transmitted to the fetus during pregnancy, with 0.5 to 2.5% of all newborns showing evidence of congenital infection. Unfortunately, the in utero infection is not benign, and 5 to 10% of the congenitally infected infants will be symptomatic at birth, with serious neurological defects. (for review, see (67)). Of the 5% to 10% that are symptomatic at birth, most develop sequelae such as microcephaly, sensorineural hearing loss, optic atrophy and chorioretinitis, and motor disabilities. 
Even the infected children who appear asymptomatic at birth are at high risk, as 10 to 15% of these children will show varying degrees of neurological damage later in life. The problem is intensified by the large increase in the number of young children in day care centers as the transmission rate of HCMV among children in these centers is high, and these children will frequently transmit the virus to their seronegative mothers or day care providers. The annual seroconversion rate for women with infected children is 30% as compared to a 3% rate for women with uninfected children. Moreover, immunization with the Towne strain of HCMV, which has been tested as a potential vaccine, did not significantly decrease the transmission rate. Since it is this group of women who most commonly will be pregnant, the risk to the newborn is significant. While the serological status of the mother positively correlates with protection of the newborn from disease, recent evidence strongly suggests that prior maternal immunity is not completely protective against neonatal disease from recurrent infection or infection with a different HCMV strain (4, 21"} +{"output_text": " the electronic commerce, the user accesses a Web site of a company through the Internet to do shopping. When a physical store is newly open, people in the neighborhood notice it. In these days of a large number of Web pages in the world reaching one billion, even if a Web site is established, the possibility that no one notices it and nothing is sold is high. On the other hand, when the user uses the Web sites, there are no restrictions from a geometrical and time viewpoints.", "input_text": " and 2. The base pillars are 200 \u03bcm in length and 50 \u03bcm in diameter. Hierarchical nanofiber-based structures have been developed using polymethylmethacrylate by chronologically employing two porous alumina templates. The adhesion strength in this hierarchical structure is also severely low. 
The heavily packed pillars cause clumping in the structure, which is suggested to lead to this deteriorated adhesion. None of the prior art made use of fiber spinning methodologies to form the adhesives.\nHigh aspect ratio (AR) structures exhibit significant shear adhesion strength compared to ones with low AR's. Various techniques have been employed to fabricate moderate AR structures including nanomolding, e-beam lithography, and replication of nanoporous membranes with polymers. The methodologies are costly to be scaled up for mass production.\nThus, there is a need in the art for improved methods of forming dry adhesives from electrospinning spinnable materials. There is also a need in the art for dry adhesives with high shear adhesion strength and low normal detachment. The present invention relates to a logical volume administration method and, more particularly, a method of administrating a logical unit in a disk memory apparatus.\nIn recent years, electronic commerce has been being rapidly spread in association with the penetration of the Internet into the market. In the electronic commerce, the user accesses a Web site of a company through the Internet to do shopping. When a physical store is newly open, people in the neighborhood notice it. In these days of a large number of Web pages in the world reaching one billion, even if a Web site is established, the possibility that no one notices it and nothing is sold is high. On the other hand, when the user uses the Web sites, there are no restrictions from a geometrical and time viewpoints. Companies can regard the people all over the world as potential customers, so that there is a big business chance. 
In"} +{"output_text": " of the cam surface is accelerated and the durability of the cam is reduced.\nFurther, in the case of a folding mechanism having a plurality of fixed cams and movable cams in symmetrical positions 180\u00b0 apart from each other, the cam surface of each cam tends to be worn out by abutment with the movable shaft. As a result, the cam surface is worn out and the durability of the cam is reduced.\nFurther, in the case of a folding mechanism having a plurality of fixed", "input_text": " same state as is brought about when movable housing 13 is opened to around 180\u00b0 from its closed state as described above. Thus, as shown in FIG. 10C, the state in which the tip ends of fixed cams 2A and 2B are in resilient contact with the flat portions on the left-hand side of movable cams 4A and 4B and movable housing 13 is folded down is brought about.\nHowever, in the described state, substantially no force in the opening direction is applied to movable member 3 as described above, and hence, substantially no force in the closing direction is applied to movable housing 13.\nThus, angle of opening over 180\u00b0 cannot be attained by the above described conventional folding mechanism. 
Further, substantially no force in the closing direction is applied to movable housing 13 in the state of movable housing 13 folded down with display portion 13A face up, i.e., in the state of folding mechanism 10 opened approximately 180\u00b0 from its closed position.\nAs a result, there arises a problem in an electronic apparatus employing such conventional folding mechanism 10 that a gap or play tends to be produced between movable housing 13 and fixed housing 12 and it becomes difficult to hold the opened and closed positions securely.\nOn the other hand, if such a configuration is made that has a single fixed cam and a single movable cam to come into resilient contact with each other, instead of such that has a plurality of fixed cams and movable cams in symmetrical positions 180\u00b0 apart from each other to come into resilient contact as described above, a folding mechanism capable of attaining an angle of opening wider than 180\u00b0 can be realized.\nIn such case, however, since resilient contact between cams is made only in one position, each cam tends to incline from its right position and the inclined cam tends to come into abutment with the movable shaft. As a result, wear and tear"} +{"output_text": " can be visualized. In other applications, the external profile of a subject can be used to help in the diagnosis of the subject. For example, in medical applications, the external profile of a subject can be used to help in the diagnosis of a patient's condition. For example, in medical applications, the external profile of a patient can be used to help in the diagnosis of a patient's condition. For example, in medical applications, the external profile of a patient can be used to help in the", "input_text": " Conebeam Computed Tomography (CT) scanners, or other types of scanners can be used to produce volumetric data of a scanned object such as part of a patient's body. 
From this volumetric data, a volumetric image of the scanned object can be produced. The volumetric image might be a three-dimensional object representation using volumetric rendering technology such as ray tracing or surface rendering after converting the volumetric data into an iso-surface model. As used herein, \u201csubject\u201d is used in context to refer to that physically present object, patient, patient body part, thing, etc. that is observed, scanned, etc. by an input device such as a scanner.\nImages rendered based on the volumetric data of a subject can be adjusted to present different views of the subject. For example, by changing the threshold of opacity and transparency control, the internal shape or external shape and/or structure of the subject can be visualized. In medical applications, the image threshold call be adjusted to visualize the hard tissue or soft tissue, i.e., bones or muscles and flesh. As the volumetric image is adjusted to visualize the external shape of the subject, the outcome can be very similar to that provided by a three-dimensional surface scanner. That is, when there is sufficient contrast between the external volume density and the surrounding air or other containing background material the surface of the subject can be very clearly visualized. For example, in medical applications, the three-dimensional facial profile of the patient can be clearly created and visualized.\nEven though the goal of the volumetric scan of a subject is not necessarily to visualize the external profile of a subject, the external profile may be helpful if it is shown along with the internal structure. 
For example, in medical applications like plastic surgery, orthodontics, oral and maxillofacial surgery, the facial profile along with the underlying skull structure"} +{"output_text": ",743,769; 8,743,771; 8,743,772; 8,743,773; 8,743,774; 8,743,775; 8,743,776; 8,743,777; 8,743,778; 8,743,779; 8,743,780; 8,743,781; 8,743,782; 8,743,783; 8,", "input_text": ",559,442; 8,560,274; 8,571,046; 8,571,518; 8,577,391; 8,578,015; 8,578,054; 8,583,671; 8,583,978; 8,587,427; 8,588,108; 8,593,419; 8,593,986; 8,595,359; 8,600,830; 8,612,583; 8,615,257; 8,619,576; 8,619,789; 8,620,772; 8,620,784; 8,621,577; 8,622,837; 8,624,771; 8,625,515; 8,626,344; 8,630,177; 8,630,291; 8,630,314; 8,631,101; 8,636,395; 8,638,667; 8,638,763; 8,652,038; 8,654,627; 8,654,649; 8,665,890; 8,667,084; 8,670,416; 8,675,678; 8,682,982; 8,693,322; 8,699,333; 8,699,368; 8,699,377; 8,700,301; 8,700,302; 8,700,536; 8,707,785; 8,712,711; 8,715,072; 8,718,055; 8,719,563; 8,725,274; 8,727,978; 8,730,047; 8,730,875; 8,732,454; 8,738,944; 8,743,750; 8,743,768; 8"} +{"output_text": " sought. Preferred F.sub.1 hybrids are more vigorous than their inbred parents. This hybrid vigor, or heterosis, can be manifested in many polygenic traits, including increased vegetative growth and increased yield.\nThe development of a maize hybrid involves three steps: (1) the selection of plants from various germplasm pools for initial breeding crosses; (2) the selfing of the selected plants from the breeding crosses for several generations to produce a series of inbred lines", "input_text": ".\nRecurrent selection breeding, backcrossing for example, can be used to improve inbred lines and a hybrid which is made using those inbreds. Backcrossing can be used to transfer a specific desirable trait from one inbred or source to an inbred that lacks that trait. 
This can be accomplished, for example, by first crossing a superior inbred (recurrent parent) to a donor inbred (non-recurrent parent), that carries the appropriate gene(s) for the trait in question. The progeny of this cross is then mated back to the superior recurrent parent followed by selection in the resultant progeny for the desired trait to be transferred from the non-recurrent parent. After five or more backcross generations with selection for the desired trait and for the germplasm inherited from the recurrent parent, the progeny will be homozygous for loci controlling the characteristic being transferred, but will be like the superior parent for essentially all other genes. The last backcross generation is then selfed to give pure breeding progeny for the gene(s) being transferred. A hybrid developed from inbreds containing the transferred gene(s) is essentially the same as a hybrid developed from the same inbreds without the transferred gene(s).\nElite inbred lines, that is, pure breeding, homozygous inbred lines, can also be used as starting materials for breeding or source populations from which to develop other inbred lines. These inbred lines derived from elite inbred lines can be developed using the pedigree breeding and recurrent selection breeding methods described earlier.\nDevelopment of Maize Hybrids\nA single cross maize hybrid results from the cross of two inbred lines, each of which has a genotype that complements the genotype of the other. The hybrid progeny of the first generation is designated F.sub.1. 
In the development of commercial hybrids only the F.sub.1 hybrid plants are"} +{"output_text": ".\nThe present invention provides a method for treating fungal infections in a human or animal subject, comprising administering to the subject a therapeutically effective amount of a composition comprising a therapeutically effective amount of a compound of formula I, a therapeutically effective amount of a compound of formula II, and a therapeutically effective amount of a penetration enhancer.\nThe present invention also provides a method for treating fungal infections in a human or animal subject, comprising administering to the subject a therapeutically", "input_text": "ages have been used to hold drug reservoirs in place in an attempt to enhance absorption of the pharmaceutical agent. However the bandages are thick, awkward, troublesome and generally lead to poor patient compliance.\nHydrophilic and hydrophobic film forming topical antifungal solutions have also been developed. These dosage forms provide improved contact between the drug and the nail, but the films are not occlusive. Topical formulations for fungal infection treatment have largely tried to deliver the drug to the target site (an infected nail bed) by diffusion across or through the nail.\nNail is more like hair than stratum corneum with respect to chemical composition and permeability. Nitrogen is the major component of the nail attesting to the nail's proteinaceous nature. The total lipid content of mature nail is 0.1-1.0%, while the stratum corneum lipid is about 10% w/w. The nail is 100-200 times thicker than the stratum corneum and has a very high affinity and capacity for binding and retaining antifungal drugs. Consequently little if any drug penetrates through the nail to reach the target site. 
Because of these reasons topical therapy for fungal infections have generally been ineffective.\nCompounds known as penetration or permeation enhancers are well known in the art to produce an increase in the permeability of skin or other body membranes to a pharmacologically active agent. The increased permeability allows an increase in the rate at which the drug permeates through the skin and enters the blood stream. Penetration enhancers have been successful in overcoming the impermeability of pharmaceutical agents through the skin. However, the thin stratum corneum layer of the skin, which is about 10 to 15 cells thick and is formed naturally by cells migrating toward the skin surface from the basal layer, has been easier to penetrate than nails. Moreover, known penetration enhancers have not proven to be useful in facilitating drug migration through the nail tissue"} +{"output_text": " are difficult to administer in a sustained manner.\nThe use of gene therapy to treat Parkinson's disease has also been proposed. For example, the use of adeno-associated virus (AAV) vectors to deliver a gene encoding GDNF to the brain has been proposed. However, the use of AAV vectors is problematic, since the AAV vectors are unable to cross the blood-brain barrier and are unable to infect neurons.\nThe use of gene therapy to treat Parkinson's disease has also", "input_text": "atal degeneration, postpone the onset of illness, or that substantively slow disability (Shoulson, supra).\nOther methods for the treatment of Parkinson's disease involve neurosurgical intervention. The thalamic outputs of the basal ganglia are an effective lesion target for the control of tremor (i.e., thalamotomy). Despite the development of modem imaging and surgical techniques to improve the effectiveness of these neurosurgical interventions for the treatment of Parkinson's disease tremor symptoms, the use of neurosurgical therapies is not widely applicable. 
For example, thalamotomy does not alleviate the akinetic symptoms which are the major functional disability for many people suffering from Parkinson's disease (Marsden et al., Adv. Neurol., 74:143-147 [1997]).\nTherapeutic methods aimed at controlling suspected causative factors associated with Parkinson's disease (e.g., therapies which control oxidative stress and excitotoxicity) have also been developed. Clinical trials have shown that administration of antioxidative agents vitamin E and deprenyl provided little or no neuroprotective function (Shoulson et al., Ann. Neurol., 43:318-325 [1998]). Glutamate-receptor blockers and neuronal nitric oxide synthase (NOS) inhibitors have been proposed as therapies for Parkinson's disease, however, no experimental results from human studies have yet been published (Rodriguez, Ann. Neurol., 44:S175-S188 [1998]).\nThe use of neurotrophic factors to stimulate neuronal repair, survival and growth in Parkinson's disease has also been studied, particularly the use of glial cell line-derived neurotrophic factor (GDNF). Although GDNF protein protects some dopamine neurons from death, it is difficult to supply GDNF protein to the brain. Furthermore, the use of such protein therapies in general is problematic, since protein molecules show rapid in vivo degradation, are unable to penetrate the blood-brain barrier and"} +{"output_text": "S. construction projects. For example, when forming a wall, the panels are typically extended by inserting wood slats between the panels. Similarly, when forming a column, the panels are typically extended by inserting wood slats between the panels.\nThe use of wood slats is problematic for several reasons. First, wood slats are typically made of wood. As such, they are subject to rotting and warping. Second, wood slats are typically not as strong as the panels they are", "input_text": " copies of media contents. 
Moreover, this secure copy-protection system should have chipset level-2 security performance so that even if an adversary opens the hardware chipset and probes its internal bus, the adversary still cannot obtain the critical secrets. 1. Field of the Invention\nThe present invention relates in general to the field of building construction. More particularly, the present invention relates to building construction form work structures. Specifically, a preferred embodiment of the present invention relates to outside conversion corner piece for joining form work panels.\n2. Discussion of the Related Art\nHistorically, builders have used form work panels to form walls and columns. For example when forming a wall, concrete is poured between two opposing panels of form work and over vertically projecting re-bar. After the concrete cures, the panels are removed to leave a free-standing wall. Similarly, when forming a column concrete is poured over inside pairs of opposing panels of form work and vertically projecting re-bar. When the concrete cures, the panels are removed to leave a free-standing column.\nSome form work panels are imported from abroad. These panels are often made according to the exporting country\"\"s measurement system. For example, it is nearly impossible to use panels imported from Europe on construction projects in the U.S. or other home country. This is because imported panels are typically created to conform with metric units. Metric units do not translate well in the world of U.S. building construction because contractors are typically not as familiar with such measurements and equipment. 
Moreover, building codes and blueprint specifications are not easily tailored to metric units to meet the builders\"\" needs.\nAs is known to those skilled in the art, wood slats or other xe2x80x9cfillersxe2x80x9d must often be used to extend the dimensions of the panels so that they can be used in U."} +{"output_text": " the tone signal.\nThe band pass filter 103 extracts the frequency component of 2130 Hz from the dual tone signal. The band pass filter 104 extracts the frequency component of 2750 Hz from the dual tone signal. The tone detection circuit 105 detects whether or not the component of the tone signal is present. The guard time setting circuit 106 determines whether or not the component of the tone signal is continuously detected for a predetermined guard time that is shorter than the time period of the tone signal.\nThe band", "input_text": " telephone of the first user so as to mute the receiving speech signal. Responding to the CAS signal, the telephone of the first user sends a DTMF signal that represents \"D\" so as to receive information from the third user. When the telephone exchange has received the \"D\" signal of the DTMF signal from the telephone terminal of the first user, the telephone exchange sends data of the telephone number and the name of the third user as FSK (Frequency Shift Keying) modulated data. The telephone terminal of the first user demodulates the FSK-modulated data, decodes the data of the telephone number and the name of the third user, and displays the decoded data on the display of the telephone terminal.\nIn the new type caller ID service corresponding to the call-waiting signal, before the telephone exchange sends the telephone number and the name of the third user, it sends the CAS signal that causes the telephone terminal of the first user to mute the receiving speech signal. The CAS signal is a dual tone signal with frequencies of 2130 Hz and 2750 Hz. The CAS signal lasts for 80 msec. 
The telephone terminal that accomplishes the new type caller ID service corresponding to the call-waiting signal has a signal detection circuit that detects the dual tone signal with frequencies 2130 Hz and 2750 Hz.\nGenerally, as shown in FIG. 1, the signal detection circuit that detects such a dual tone signal comprises a frequency band limiting filter 102, a band pass filter 103 that extracts a frequency component of 2130 Hz, a band pass filter 104 that extracts a frequency component of 2750 Hz, a tone detection circuit 105 that detects whether or not the component of the tone signal is present, and a guard time setting circuit 106 that determines whether or not a component of the tone signal is continuously detected for a predetermined guard time that is shorter than the time period of"} +{"output_text": "cidyl methacrylate (JOC 354 (1986) 211-217) ______________________________________\nThe prior art also includes the following references which describe the use of porous microspheres in chromatography:\n______________________________________ U.S. Pat. No. 3,975,350 U.S. Pat. No. 3,975,351 U.S. Pat. No. 3,975,352 U.S. Pat. No. 3,975,353 U.", "input_text": ". They report this material to be effective for the reverse phase chromatography of proteins using flow rates in the range of 0.5 to 1.5 mL per minute.\nHirata et al. [Journal of Chromatography, 396, 115-120 (1987)] discuss the performance of Asahipak GS columns (hydrophilic gels of vinyl alcohol copolymer) when exposed to various organic solvents. In all cases the use of organic solvents affects either swelling or shrinking in the gels. This is a very undesirable property for supports for use in HPLC.\nHjerten et al. [Journal of Chromatography 396, 101-113 (1987)] disclose a chromatographic support based on agarose crosslinked with divinyl sulphone. 
This support has enhanced rigidity compared to standard agarose but still can only withstanding pumping pressures up to 580 psi.\nPorath [Journal of Chromatography 218, 241-259 36 (1981)] describes a process for preparing crosslinked agarose which has a higher rigidity than standard agarose. This process involves including particles that can be dissolved under conditions that do not disturb agarose (in a low agarose content gel). The gel is then contracted by washing with a suitable organic solvent followed by drying. The gel is then crosslinked in a solvent which does not re-swell the gel. The particles are then dissolved leaving a porous, crosslinked agarose. Porath does not discuss the pressures that these crosslinked agaroses can withstand without collapsing.\nMost of the prior art appears to each producing porous microspheres from vinyl monomers using emulsion or suspension polymerization techniques such as:\n______________________________________ Styrene - divinyl benzene Acrylonitrile-divinyl benzene (JOC 358 (1986) 129-136) Vinyl pyridine (JOC 354 (1986) 211-217) Vinyl alcohol (JOC 349 (1985) 323-329) Gly"} +{"output_text": " deviation of the CMP.\nIn addition, the test pattern 2 is formed to various shapes depending on monitoring processes. Therefore, it is difficult to measure a thickness of the test pattern 2. As a result, it is difficult to monitor a process deviation of the CMP.\nAccordingly, the present invention is directed to a method and apparatus for monitoring a polishing process.\nA method and apparatus for monitoring a polishing process are disclosed. In one embodiment, a method for monitoring a polishing process includes", "input_text": " a subsequent substrate. As a result, a process deviation of the CMP can be reduced.\nReferring to FIG. 1, main patterns 8 are disposed on a main region b of a semiconductor substrate 1 where circuits or devices will be formed. 
Test patterns 2 for monitoring a process state of the main region b are disposed at a predetermined region of the semiconductor substrate 1. Generally, the test patterns 2 are disposed at a monitoring region a, which is positioned on a scribe line between the main regions b. Each test pattern 2 is formed to various shapes depending on monitoring processes.\nReferring to FIGS. 2 and 3, a CMP target layer 10 is formed on an entire surface of the semiconductor substrate 1 where the test pattern 2 and the main pattern 8 are formed. Continuously, the CMP target layer 10 is polished to expose the main pattern 8 and the test pattern 2. Thus, an interconnection 12 is formed to fill a space between the main patterns 8. The main region b generally includes a region 8a of a high pattern density as well as a region 8b of a low pattern density. In addition, because the conventional test pattern 2 is formed for the purpose of measuring a thickness, the test pattern 2 is planarized unlike the main pattern 8. According to the CMP process, the polishing target layer 10 is polished by providing slurry having a high etch selectivity to the polishing target layer 10. Therefore, when the CMP process is performed, a polishing rate is dependant on a pattern density.\nAs illustrated in FIG. 4, after the CMP, an etched thickness of the test pattern 2 differs from that of the main pattern 8. In general, a thickness of the test pattern 2 is measured before and after the CMP in order to monitor the process. 
Thus, even though an etched thickness is measured to be appropriate, it is difficult to rely on a process"} +{"output_text": " due to the fact that the control software is not adapted to the specific screw-cylinder configuration, but rather to the specific screw-cylinder combination.\nThe object of the present invention is to provide a control device for a plastics processing machine which is capable of operating a screw-cylinder combination in a simple and reliable manner.\nThis object is achieved by a control device for a plastics processing machine having a screw-cylinder combination, the screw-cylinder combination", "input_text": " the greatest applicable injection pressure, and the greatest injection stroke depend for example on the selection of the configuration. This represents however only some of the variables that are determined by the structure of the screw, like e.g. the number and shape of the grooves and the length/diameter ratio, and by the properties of the used cylinder. Typical characteristics of a cylinder are for example the diameter, the surface finish, the number of external heating zones, i.e. the number of separately operable and heatable zones along the length extension of the cylinder, the heating or cooling capacity that can be introduced there, type and characteristics of the thermoelectric elements used as temperature sensors, to name only a few. In conformity with the screw to be used and the cylinder as well as the material to be processed, the possible areas of use of a screw cylinder combination is determined.\nAs all component-specific parameters must be considered when correctly operating a component by a control device of a plastics processing machine, the control programs running in the control device must be separately suited to each possible component configuration when shipped. 
This means for example in relation to the plasticizing unit which is used in many different screw-cylinder configurations a very complicated programming process for each delivered machine. This is also a major drawback in connection with possible changes of an existing screw-cylinder configuration, as can be encountered in the exceptional case during material change but in particular upon replacement of defective or worn screws by new ones. The continuous adjustment of the control software is also a source of errors as is any change of existing software.\nCompounding this problem is the fact that this special need for customization up to now did not enable the use of a uniform control software which would be applicable for all offered components which are combined for various customers in a machine according to a modular concept, and especially for different screw-cylinder configurations. This is"} +{"output_text": ", which is performed by a hearing test operator. The prescription is based on the audiometric characteristics of the hearing-impaired user, which may be, for example, the audiometric characteristics of the user's unaided ear, the audiometric characteristics of the user's hearing aid, or a combination of the two. The hearing aid includes a microphone, an amplifier, and a battery. The battery is used as a power source and is enclosed in a housing with a battery compartment. The battery compartment", "input_text": " posture during swinging procedures.\nU.S. Pat. No. 5,139,264 to Wootten discloses a golf swing training apparatus including a base, an upright support frame, rotary guide arm assembly at the top of the support frame establishing a reference axis of rotation at an inner arm portion and having an outer end flexibly coupled to the club head so that as the club is swung it is confined to a swing plane perpendicular to the reference axis of rotation. 
There is adjustment in frame height and angle of incline for the reference axis of rotation as well as adjustment in the drag. There is also a tensioning feature to dampen the inertia mass during the stroke. Unfortunately, this prior art reference also does not assist the user in training their muscles to memorize the desired movement of the swing and does not position the user's body in the correct stance to maximize swing accuracy.\nAccordingly, a need remains for a golf swing exercising and training apparatus in order to overcome the above-noted shortcomings. The present invention satisfies such a need by providing a device that is convenient and easy to use, is durable yet lightweight in design, is versatile in its applications, and provides golfers with much needed assistance in improving their golf swing. 1. Field of the Invention\nThe present invention relates to hearing aids. The invention, more specifically, relates to a soft custom ear mold for hearing aids. The invention further relates to a method of manufacturing a soft custom ear mold. The invention, in particular, relates to a tool for use in the method.\nIn the context of the present disclosure, a hearing aid should be understood as a small, battery-powered, microelectronic device designed to be worn behind or in the human ear by a hearing-impaired user. Prior to use, the hearing aid is adjusted by a hearing aid fitter according to a prescription. The prescription is based on a hearing test"} +{"output_text": " mounted to a side of the air bag case 11, and then mounting the air bag case 11 to the side plate 3 of the seat back frame 2.\nThe air bag case 11 is provided with a lid plate portion 12 which is formed with a gas-generating inflator mounting portion 13 for mounting the inflator 10. 
The lid plate portion 12 is formed with a gas-generating inflator mounting hole 14 which is formed with a gas-generating inflator mounting portion 13 for mounting", "input_text": ".\nWith the portable terminal according to the fifth related art, if the battery cover is provided with not only the image pickup section, but also any other device than the image pickup section, such as the flash section, even if the user wants to replace the image pickup section only, the user must replace the whole battery cover together with any other device such as the flash section and therefore the portable terminal according to the fifth related art involves a similar problem to that involved in the portable terminal according to the first related art with the image pickup section fixed to the housing.\nIn the portable terminal according to the fifth related art, if a battery cover having the image pickup section having different performance, etc., is attached to a specific main body, harmonization between the main body and the image pickup section is not guaranteed as in the second to fourth related arts. 1. Field of the Invention\nThis invention relates to a holder for mounting an air bag module to a side of a seat back, the air bag module including an air bag case and an air bag housed within the air bag case.\n2. Description of Related Art\nReferring now to FIG. 1, a conventional construction for mounting an air bag module will be discussed in order to facilitate understanding of the present invention.\nIn a conventional automotive vehicle seat provided with an air bag, an air bag module 1 which includes an air bag case and an air bag housed within the air bag case is provided at a side of a seat back B by incorporating the air bag module 1 in a cavity (not shown) formed in a side of a back pad, and mounting the air bag module 1 to a side plate 3 of a seat back frame 2 with a lid plate portion of the air bag case being exposed to an exterior.\nReferring to FIG. 
2, the air bag module 1 is assembled by causing a gas-generating inflator 10 to be"} +{"output_text": "aphylactic reaction to dextran.\nThe present invention is directed to a method for the treatment of a patient suffering from a disease or condition which is responsive to the administration of a compound having a specific binding affinity for a receptor which is expressed on the surface of a cell. The method comprises administering to the patient a therapeutically effective amount of a compound having a specific binding affinity for a receptor which is expressed on the surface of a cell, wherein the compound is capable of binding to the receptor and", "input_text": " which indicated that 30 of these patients having previous tuboplasties had severe adhesions, one-third of which were judged to be inoperable.\nHigh molecular weight dextran either alone or in combination with dextrose has been used in the prevention of peritoneal adhesions subsequent to surgery. Dextran is clinically standardized to a low molecular weight of about 75,000 by partial hydrolysis and fractional precipitation of the high molecular weight particles which normally have molecular weights of up to 200,000. Dextran is a polymer of glucose which has a chain-like structure and is produced from sucrose by Leuconostoc bacteria. In articles appearing in Fertility and Sterility, volume 33, number 6, June 1980, pages 660-662, Holtz, Baker, and Tsai and volume 34, number 4, October 1980, pages 394-395, by Holtz and Baker, results are reported of the adhesion reducing effects of a 32% (aqueous) solution of dextran 70 containing 10% dextrose (sold under the trade name HYSKON by Pharmacia, of Piscataway, N.J.). 
Holtz et al postulate several mechanisms of action in the prevention of peritoneal adhesions utilizing HYSKON including a simple mechanical separation of adjacent surfaces, termed a hydroflotation effect.\nProject coordinator diZerega and several contributors have reported the results of a large study in an article entitled \"Reduction of Post-operative Pelvic Adhesions with Intraperitoneal 32% Dextran 70: A Prospective, Randomized Clinical Trial\" in Fertility and Sterility, volume 40, number 5, for November 1983, pages 612-619. The authors, at page 618, indicate that the use of Dextran intraperitoneally has limitations such as the reported tendency of HYSKON to support bacterial proliferation and concern over the an"} +{"output_text": " the diaphragm. The aorta is the main artery of the body that supplies blood to the body. The aorta is a tubular structure that arises from the left ventricle of the heart and passes through the thorax and abdomen to the bifurcation into the iliac arteries. The aorta is the body's main artery and is responsible for supplying blood to the body. 
The aorta is a tubular structure that arises from the left ventricle of the heart and passes through the thorax and abdomen to the bifurcation into the iliac arteries", "input_text": "\nThe invention also provides a ferroelectric liquid crystal (FLC) device comprising a xcfx84-V min FLC material, at least one switching electrode, for applying a switching pulse, and at least one data electrode, for applying a data pulse, wherein:\na) said FLC material is such that at voltages below Vmin, xcfx84100%rev/xcfx840%sw less than 50, where xcfx84100%rev is the minimum duration of a monopolar voltage pulse required to achieve 100% reverse switching of said FLC material, and xcfx840%sw is the duration of a monopolar voltage pulse at which forward switching of said FLC material begins; and\nb) said switching pulse is followed by a first pulse of opposite polarity to said switching pulse.\nAt voltages below Vmin, xcfx84100%rev/xcfx840%sw may be less than 30.\nThe invention also provides a light modulating device comprising a FLC device as described above, wherein said FLC material is in the form of a layer, said switching electrode is one of a plurality of such switching electrodes on one side of said layer, said data electrode is one of a plurality of such data electrodes on the other side of said layer, and a plurality of pixels are defined in said layer at the intersections of said switching and data electrodes. A. Field of the Invention\nThe present invention relates to blood vessel graft systems for repairing aneurysms, and more particularly to a catheter-based graft system for repairing aortic aneurysms by deploying a graft within a blood vessel via percutaneous entry into a femoral artery of a patient.\nB. Description of the Prior Art\nAn aortic aneurysm is a very common deteriorating disease typically manifested by weakening and expansion of the aorta vessel wall at a region between the aorto-renal junction and"} +{"output_text": " the stack of conductor strands. 
The braze alloy is then allowed to cool and solidify to form a brazed joint.\nThe brazed joints between the conductor strands and the cooling fluid box are typically made by induction heating. The braze alloy is melted and wicked into the voids or gaps between the conductor strands and the cooling fluid box. The braze alloy is then allowed to cool and solidify to form a brazed joint.\nThe brazed joints between the conductor strands and the", "input_text": " solid, conductor strands and a plurality of hollow conductor strands. These solid conductor strands and hollow conductor strands are arranged to form a bar. The rectangular conductor strands are generally arranged or stacked in columns or rows, with the hollow conductor strands interspaced among the solid conductor strands. The hollow conductor strands each have an internal duct for conducting coolant through the armature bar.\nEach armature bar extremity ends at a cooling fluid box which acts as a reservoir for the cooling fluid, and which links with other elements of the cooling circuit. A cooling fluid box can also be referred to as a \u201chydraulic clip\u201d, \u201cclip\u201d, \u201cheader\u201d, \u201cend fitting\u201d, \u201cwater box\u201d, or another variation of these terms. The connection between each bar and its associated cooling fluid box is intended to be impervious to prevent the cooling fluid from leaking between the outside and inside of the cooling fluid box since leaks can result in isolation defects and corrosion problems.\nTo make the junction between the armature bar end and the cooling fluid box impervious to cooling fluid leaks, the end of the armature bar is brazed to the cooling fluid box. At one open end, the cooling fluid box encloses the ends of the conductor strands of one end of the armature bar, and a braze alloy bonds the end of each conductor strand to the neighboring conductor strand(s) and/or to the neighboring surface(s) of the cooling fluid box. 
The brazed joints between the adjacent conductor strands, and the brazed joints between the conductor strands and the cooling fluid box should retain electrical integrity while providing a fluid-tight barrier.\nTo braze, the hollow and solid conductor strand ends are assembled in stacks and positioned within the cooling fluid box. Braze alloy is then melted and wicked into voids or gaps during induction heating. The braze alloy spreads, bridging from surface to surface to fill the gaps through"} +{"output_text": " as to whether an interrupt has occurred. If an interrupt has occurred, then flow proceeds to block 208. If an interrupt has not occurred, then flow proceeds to block 210.\nAt block 208, the cause of the interrupt is determined. The cause of the interrupt is determined by the interrupt controller 110. The cause of the interrupt is then provided to the microprocessor 102 via the cause register 108.\nAt block 210, the interrupt is handled by the program that is currently executing. The program that", "input_text": " core 104 for executing instructions retrieved from the memory 120. In addition, the core 104 produces a number of interrupts 106, including both software interrupts and hardware interrupts (e.g., timer overflow) that must be \u201chandled\u201d by the microprocessor 102, as will be further described below with reference to FIG. 2. The microprocessor 102 further includes a cause register 108 for indicating to the microprocessor 102 the cause or source of an interrupt.\nThe interrupt controller 110 is coupled to a number of external devices 118 via interrupt lines 116, and to other system interrupts 114. The interrupt controller 110, orders the interrupts 110 to provide them to the microprocessor 102 via interrupt lines 112. One skilled in the art will appreciate that early microprocessors 102 were provided with a preset number of interrupt lines 112 for use by system level designers. 
However, as the need for interrupts increased, rather than adding additional pins on the microprocessor, interrupt controllers 110 were provided to interface between the increased number of interrupts 114, 116, and the existing interrupt lines 112 on the microprocessor 102.\nThe microprocessor 102 is connected to the memory 120, to retrieve instructions for execution, as mentioned above, to retrieve information relating to interrupts, such as an interrupt vector table 122, and to retrieve the programs which handle the interrupts 124.\nReferring now to FIG. 2, a flow chart 200 is shown that illustrates prior art program flow when an interrupt occurs within the microprocessor 102. Operation of the program flow for handling interrupts will now be described with reference to both FIGS. 1 and 2.\nProgram execution begins at block 202 and proceeds to block 204.\nAt block 204, instructions are executed by the microprocessor 102 that are retrieved from memory 120. Flow then proceeds to decision block 206.\nAt decision block 206, a determination is made by the microprocessor 102"} +{"output_text": " the bed system in the same manner as normal paraffins.\nThe Asselin patent is also silent as to arrangements of multiple number of different sieves which ma be present in the absorption separation technique. In fact, in the drawing of Asselin, the adsorption bed system, 50, is comprised of calcium 5A zeolite in the form of 1/16-inch cylindrical pellets. Branched paraffins, whether they be mono- or di-branched, flow through the", "input_text": " in Holcombe, U.S. Pat. No. 4,210,771. 
This is a process for the virtually complete isomerization of normal paraffin hydrocarbons in a feed stream consisting essentially of mixed normal and branched hydrocarbons, where the feed stream is passed first through an isomerization reactor and the products derived therefrom are passed to an adsorption section which separates normal from branched paraffins to form an isomerate having both di- and mono-branched paraffins. A recycle stream comprising nearly pure normal paraffins is usually recycled to exhaustion. Other disclosures which may be commensurate with Holcombe comprise U.K. Pat. No. 876,730 and U.S. Pat. No. 3,755,144 issued to Asselin.\nThe zeolite molecular sieve employed in Gray et al and Holcombe may be selected from any adsorbent which selectively adsorbs normal paraffins based on the molecular pore size of the aluminosilicate. Particularly suitable zeolites of this type are calcium exchanged zeolite 5-A. Naturally occurring zeolite molecular sieves which could be substituted for calcium 5-A zeolite include chabazite and erionite. The particular flow scheme of adsorption as taught by Holcombe '771 is herein incorporated by reference to show an operable multiple zeolitic molecular sieve absorption means, to achieve proper adsorption-fill and desorption-purge. The Holcombe patent is completely silent as to arrangements of multiple number of different sieves which may be present in the absorption separation technique. In fact, in the drawing of Holcombe, the adsorption bed systems, 44, 46, 48, and 50, are all comprised of calcium 5A zeolite in the form of 1/16-inch cylindrical pellets. Branched paraffins, whether they be mono- or di-branched, flow through"} +{"output_text": " peptide would be highly desirable.\nA number of methods have been developed for the site-specific attachment of PEG to proteins. For example, U.S. Pat. No. 5,932,462 (Geysen et al.) describes a method for site-specific attachment of PEG to proteins. 
The Geysen method involves the formation of a PEG-protein conjugate by reacting a heterobifunctional PEG with a protein that has been activated for reaction with PEG. The PEG-activ", "input_text": " with improved pharmacokinetic properties are produced by attaching synthetic polymers to the peptide backbone. An exemplary polymer that has been conjugated to peptides is poly(ethylene glycol) (\u201cPEG\u201d). The use of PEG to derivatize peptide therapeutics has been demonstrated to reduce the immunogenicity of the peptides. For example, U.S. Pat. No. 4,179,337 (Davis et al.) discloses non-immunogenic polypeptides such as enzymes and peptide hormones coupled to polyethylene glycol (PEG) or polypropylene glycol. In addition to reduced immunogenicity, the clearance time in circulation is prolonged due to the increased size of the PEG-conjugate of the polypeptides in question.\nThe principal mode of attachment of PEG, and its derivatives, to peptides is a non-specific covalent bonding through a peptide amino acid residue (see e.g., U.S. Pat. No. 4,088,538 U.S. Pat. No. 4,496,689, U.S. Pat. No. 4,414,147, U.S. Pat. No. 4,055,635, and PCT WO 87/00056). Another mode of attaching PEG to peptides is through the non-specific oxidation of glycosyl residues on a glycopeptide (see e.g., WO 94/05332), which is followed by the reductive amination of the resulting carbonyl moiety with an amino-PEG species.\nIn these non-specific methods, poly(ethylene glycol) is added in a random, non-specific manner to reactive residues on a peptide backbone. Random attachment of PEG molecules has drawbacks, including a lack of homogeneity of the final product, and the possibility for reduction in the biological or enzymatic activity of the peptide. 
Therefore, for the production of therapeutic peptides, a derivitization strategy that results in the formation of a specifically labeled, readily characterizable, essentially homogeneous PEGylated"} +{"output_text": " in effect, to xe2x80x9cfloatxe2x80x9d on the smaller contact.\nThe foregoing discussion has been directed to the situation where the contact-to-pad pressure is maintained during scrubbing by the elastomeric layer. However, it will be appreciated that the elastomeric layer will also exert a recovery force on the tilting beams 90 and thus on the contacts 93. This recovery force will be greatest when the contact-to-pad pressure is at its", "input_text": " this fashion, the insulative oxide buildup on each pad is removed so as to ensure adequate contact-to-pad electrical connections.\nFIG. 8 shows, in dashed line view, the relative positions of the contact 88 and pad 100 at the moment of initial engagement or touchdown and, in solid-line view, these same elements after xe2x80x9covertravelxe2x80x9d of the pad by a distance 106 in a vertical direction directly toward the flat support surface 70. As indicated, the distance 108 of lateral scrubbing movement is directly dependent on the vertical deflection of the contact 88 or, equivalently, on the overtravel distance 106 moved by the pad 100. Hence, since the overtravel distance for each contact on the central region 80a will be substantially the same (with differences arising from variations in contact height), the distance of lateral scrubbing movement by each contact on the central region will be substantially uniform and will not, in particular, be affected by the relative position of each contact on the central region.\nBecause the elastomeric layer 98 is backed by the incompressible support surface 70, the elastomeric layer exerts a recovery force on each tilting beam 90 and thus each contact 93 to maintain contact-to-pad pressure during scrubbing. 
At the same time, the elastomeric layer accommodates some height variations between the respective contacts. Thus, referring to FIG 9a, when a relatively shorter contact 88a is situated between an immediately adjacent pair of relatively taller contacts 88b and these taller contacts are brought into engagement with their respective pads, then, as indicated in FIG. 9b, deformation by the elastomeric layer allows the smaller contact to be brought into engagement with its pad after some further overtravel by the pads. It will be noted, in this example, that the tilting action of each contact is locally controlled, and the larger contacts are able,"} +{"output_text": " Example 10.\nThe antibodies, which specifically recognize human Delta-1 and human Serrate-1, can be prepared by the following method.\nThe fusion protein of FLAG and human IgGFc, which is expressed in COS-7 cell, is purified by the method described in Example 8. The fusion protein is then used as an antigen to immunize a mouse. The mouse is immunized with the antigen, and the spleen cells are collected. The spleen cells are fused with myeloma cells", "input_text": "-1 are expressed in COS-7 cell (obtainable from the Institute of Physical and Chemical Research, Cell Development Bank, RCB0539), and the transformants which were transformed by these expression plasmids, can be obtained. 
Further, human Delta-1 polypeptide and human Serrate-1 polypeptide can be produced by culturing the transformants under preferable culture condition in medium by known culture method.\nAs shown in Example 8, human Delta-1 polypeptide and human Serrate-1 polypeptide can be isolated and purified from the above cultured mass, in general, by the following methods.\nFor extraction of the substance from cultured microbial cells or cells, microbial cells or cells are collected by known method such as centrifugation after the cultivation, suspended in preferable buffer solution, disrupted the microbial cells or cells by means of ultrasonication, lysozyme and/or freeze-thawing and collected crude extract by centrifugation or filtration. The buffer solution may contain protein-denaturing agents such as urea and guanidine hydrochloride or surface active agents such as Triton-X. In case of secretion in the cultured solution, the cultured mass is separated by the known method such as centrifugation to separate from microbial cells or cells and the supernatant solution is collected.\nThe thus obtained human Delta-1 or human Serrate-1, which are contained in the cell extracts or cell supernatants, can be purified by known protein purification methods. During the purification process, for confirmation of existence of the protein, in case of the fused proteins of the above FLAG and human IgGFc, they can be detected by immunoassay using antibody against known antigen epitope and can be purified. In case of not to express as such the fused protein, the antibody in Example 9 can be used for detection.\nAntibodies, which specifically recognize human Delta-1 and human Serrate-1, can be prepared as shown in"} +{"output_text": " 1000 nucleotides in length are difficult to synthesize but can be generated by recombinant DNA techniques. 
Individuals skilled in the art will readily recognize that the nucleic acids, for use as capture ligands, can be provided with a label to facilitate detection of a hybridization product.\nNucleic acid isolated and synthesized in accordance with the sequence of the invention contained in the Sequence Listing can also be useful as capture ligands to detect homologous regions (especially homologous genes) of other Pseudomonas species using appropriate stringency hybridization conditions as", "input_text": "A nucleic acid isolated or synthesized in accordance with the sequence of the invention contained in the Sequence Listing can be used as a probe to specifically detect P. aeruginosa. With the sequence information set forth in the present application, sequences of twenty or more nucleotides are identified which provide the desired inclusivity and exclusivity with respect to P. aeruginosa, and extraneous nucleic acids likely to be encountered during hybridization conditions. More preferably, the sequence will comprise at least about twenty to thirty nucleotides to convey stability to the hybridization product formed between the probe and the intended target molecules.\nSequences larger than 1000 nucleotides in length are difficult to synthesize but can be generated by recombinant DNA techniques. Individuals skilled in the art will readily recognize that the nucleic acids, for use as probes, can be provided with a label to facilitate detection of a hybridization product.\nNucleic acid isolated and synthesized in accordance with the sequence of the invention contained in the Sequence Listing can also be useful as probes to detect homologous regions (especially homologous genes) of other Pseudomonas species using appropriate stringency hybridization conditions as described herein.\nCapture Ligand\nFor use as a capture ligand, the nucleic acid selected in the manner described above with respect to probes, can be readily associated with a support. 
The manner in which nucleic acid is associated with supports is well known. Nucleic acid having twenty or more nucleotides in a sequence of the invention contained in the Sequence Listing has utility to separate P. aeruginosa nucleic acid of one strain from the nucleic acid of another strain as well as from other organisms. Nucleic acid having twenty or more nucleotides in a sequence of the invention contained in the Sequence Listing can also have utility to separate other Pseudomonas species from each other and from other organisms. Preferably, the sequence will comprise at least about twenty nucleotides to convey stability to the hybridization product formed between the probe and the intended target molecules. Sequences larger than"} +{"output_text": " the verification step, sense amplifier 19 is enabled to sense the threshold voltages of the cells of array 16. Sense amplifier 19 is enabled by control unit 29, which causes the output of sense amplifier 19 to be fed back to AND gate 22. The output of AND gate 22 is fed back to control unit 29, which, in turn, causes control unit 29 to generate a signal to enable sense amplifier 19. The signal to enable sense amplifier 19 is generated by AND gate 22, which is enabled by the", "input_text": " time during an erase operation. Each erase operation comprises a sequence of steps, including \"verification\" steps for verifying that the cells have desired threshold voltages at each of one or more stages of the erase operation.\nMore specifically, if cells of memory array 16 of FIG. 1 are to be erased, an \"Erase Setup\" command and then an \"Erase Confirm\" command are sent from an external device to I/O pad 30. Where each such command comprises parallel bits, the different bits are sent in parallel to I/O pad 30 and to additional I/O pads identical to I/O pad 30. 
The command is transferred from I/O pad 30 (or from I/O pad 30 and additional I/O pads) to input buffer 18 (or input buffer 18 and input buffers connected to the other I/O pads), and then to control unit 29. Control unit 29, which typically includes command execution logic and a state machine, processes the command to generate instruction data, and supplies the instruction data to circuit 14 and sense amplifier 19 (and to other components of memory chip 3 of FIG. 1) to cause chip 3 to execute a sequence of steps required for erasing the specified cells of array 16. These steps include verification steps (e.g., the verification step discussed below with reference to FIG. 5) for verifying that one or more of the cells have desired threshold voltages at each of one or more stages of the erase operation.\nDuring each verification step, verification data (denoted as \"RAW VERIFY OK\" in FIG. 1) is output from AND gate 22 (in response to the output of sense amplifier 19). This verification data can be fed back to control unit 29. 
Typically, an external device polls output pads of chip 3 in order to determine whether the erase operation has been completed and whether the erase operation was successful.\nMore specifically, during"} +{"output_text": " to say homologous or endogenous in relation to the cell which is to be transfected.\nThe expression gene of therapeutic interest may be a gene encoding a protein or a peptide, or a gene encoding a protein or a peptide which is not naturally expressed in the target cell.\nThe expression gene of therapeutic interest may be a gene encoding a protein or a peptide which is not naturally expressed in the target cell, but which is expressed in the target cell in the presence of a specific inducer.\nThe expression", "input_text": "DNA), messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), hybrid sequences such as DNA/RNA chimeroplasts or synthetic or semisynthetic sequences, and oligonucleotides which are modified or otherwise. These nucleic acids may be of human, animal, plant, bacterial or viral origin and the like. They may be obtained by any technique known to persons skilled in the art, and in particular by the screening of libraries, by chemical synthesis or by mixed methods including the chemical or enzymatic modification of sequences obtained by the screening of libraries. They may be chemically modified. In general, they contain at least 10, 20, 50 or 100 consecutive nucleotides, and preferably at least 200 consecutive nucleotides. More preferably still, they contain at least 500 consecutive nucleotides.\nAs regards more particularly deoxyribonucleic acids, they may be single- or double-stranded, as well as short oligonucleotides or longer sequences. In particular, the nucleic acids advantageously consist of plasmids, vectors, episomes, expression cassettes and the like. 
These deoxyribonucleic acids may carry a prokaryotic or eukaryotic replication origin which is functional or otherwise in the target cell, one or more marker genes, sequences for regulating transcription or replication, genes of therapeutic interest, anti-sense sequences which are modified or otherwise, regions for binding to other cellular components, and the like.\nPreferably, the nucleic acid comprises one or more genes of therapeutic interest under the control of regulatory sequences, for example one or more promoters and a transcriptional terminator which are active in the target cells.\nFor the purposes of the invention, the expression gene of therapeutic interest is understood to mean in particular any gene encoding a protein product having a therapeutic effect. The protein product thus encoded may in particular be a protein or a peptide. This protein product may be exogenous, homologous or endogenous in relation to the target cell, that is"} +{"output_text": " in the reference frame.\nThe motion predictor 78 also generates a motion vector for each macro block in the non-I frame for which no match is found in the reference frame. For example, if the motion predictor 78 determines that the macro block in the non-I frame is not a match for any macro block in the reference frame, the motion predictor 78 generates a motion vector that identifies the location of the non-I frame macro block with respect to the reference frame. The motion predictor 78 then", "input_text": " matching macro block can be found. The next two components, XR and YR, together comprise the two-dimensional location value that indicates where in the frame 0 the matching macro block can be found. Thus, in this example, because the location S of the frame 0 has the same X,Y coordinates as the location R in the frame 1, XR=YR=0. Conversely, the macro block in the location T matches the macro block in the location Z, which has different X,Y coordinates than the location T. 
Therefore, XZ and YZ represent the location T with respect to the location Z. For example, suppose that the location T is ten pixels to the left of (negative X direction) and seven pixels down from (negative Y direction) the location Z. Therefore, MVZ=(0, xe2x88x9210, xe2x88x927). Although there are many other motion-vector schemes available, they are all based on the same general concept.\nReferring again to FIG. 4, motion prediction is now discussed in detail. During the encoding of a non-I frame, a motion predictor 78 compares the pre-compression Y values (the CB and CR values are not used during motion prediction) of the macro blocks in the non-I frame to the decoded Y values of the respective macro blocks in the reference frame and identifies matching macro blocks. For each macro block in the non-I frame for which a match is found in the reference frame, the motion predictor 78 generates a motion vector that identifies the reference frame and the location of the matching macro block within the reference frame. Thus, as discussed below in conjunction with FIG. 6, during decoding of these motion-encoded macro blocks of the non-I frame, the decoder uses the motion vectors to obtain the pixel values of the motion-encoded macro blocks from the matching macro blocks"} +{"output_text": " etch-stop layer is deposited. However, the etch rate of the etch-stop layer is not limited to the etch rate of the lightly doped silicon. The etch rate of the etch-stop layer is also limited by the etch rate of the heavily doped silicon.\nThe etch rate of the etch-stop layer is also limited by the etch rate of the etch-stop layer. The etch rate of the etch-stop layer is limited by the etch rate of the etch-stop layer. The", "input_text": ", miniature fluid lines, pumps and valves, and in flow sensors. 
Another application with a potentially large commercial market is in fabricating silicon-on-insulator (SOI) substrates by the bond-and-etch-back silicon-on-insulator (BESOI) process.\nSilicon-based selective chemical etch-stop layers such as Si--B, Si--Ge, Si--Ge--B, Si--P, and Si--As have major problems and disadvantages which are overcome by the present invention. The disadvantages can be illustrated by examining examples pertaining to the commonly used Si--Ge--B etch-stops. First, the specially doped layer (e.g., Si--Ge--B) and the lightly doped silicon have limited selectivity. Selectivity is defined as the etch rate of lightly doped silicon divided by the etch rate of the etch-stop layer, or in some cases its reciprocal as discussed below. Limited selectivity increases the manufacturing cost by creating a need for tightly controlled, and sometimes labor-intensive processing to prevent the etch from going beyond the intended depth. This problem is exacerbated when fabricating the thin layers that are required for submicron electronic devices.\nCertain chemical solutions etch a lightly doped silicon layer more rapidly than a heavily doped layer. For this purpose lightly doped means less than approximately 1E17 dopant atoms per cm.sup.3, and heavily doped means more than approximately 1E19 dopant atoms per cm.sup.3. For example, 21 weight percent (wt %) potassium hydroxide in H.sub.2 O (KOH--H.sub.2 O) at about 70.degree. C. etches the (100) plane of lightly doped silicon rapidly (approximately 1 micrometer per minute), but the etch rate becomes slow (less than 0.01 micrometer per minute), making possible selective etching, as the"} +{"output_text": "/or their receptors in the circulation of patients with autoimmune disease, including AIDS, has been reported (Hess et al., Infection 19, Suppl 2:S93-97 (1991); Biglino et al., Infection 19 (1):11/7-11/7 (1991); Danis et al., Ann. Rheum. 
Disease 51(8):946 (1992)).\nThe inventors have now discovered that the presence of IFN\u03b1 in the circulation of patients with", "input_text": "k et al., In AIDS: The Epidemic of Karposi's Syndrome and Opportunistic Infections, A. E. Friedman-Kien & L. J. Laubenstein, eds. Masson Publishing, New York, N.Y., 1986; Hess et al., Infection 19, Suppl 2:S93-97 (1991); Biglino et al., Infection 19 (1):11/7-11/7 (1991)), and the decline seen in the serum levels of TNF-\u03b1 in RA patients following long term administration of the disease modifying drug sulfasalazine (Danis et al., Ann. Rheum. Disease 51(8):946 (1992)), further suggest that the concentrations of cytokines and/or their receptors are reflected in the clinical course of autoimmune disease.\nIFN is known to induce tumor necrosis factor (TNF) and its receptors (Lau et al. AIDS Research and Human Retroviruses 7:545 (1991)), which enhances virus replication (Matsuyama et al., Proc. Natl. Acad. Sci. USA 86:2365 (1989)). In addition to its presence in the circulation, IFNs have also been found in the cerebrospinal fluid in some patients with psychiatric and neurologic diseases (Lebikoa et al., Acta. Biot Med Germ. 38:879 (1979); Preble et al, Am. J. Psychiatry, 142:10 (1985)), as well as in patients with rheumatoid arthritis. Therefore, since healthy people do not have interferons in their spinal or synovial fluids, the inventors have suggested that one or more alpha IFNs may be involved in the development of the initial autoimmune disease response. Consequently, the removal and/or neutralization of IFN\u03b1 has been proposed as a method of treatment of patients with autoimmune disease, including AIDS. 
The appearance of cytokines and"} +{"output_text": " and the manufacturing costs are also increased.\nThe information disclosed in this Background of the Invention section is only for enhancement of understanding of the background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art that is already known in this country to a person of ordinary skill in the art.", "input_text": " from the evaporator 80 to the condensed water outlet 64.\nHowever, water is flown into the blower 10 from the outside air intake 26 due to various environments, for instance, when the vehicle travels on a waterway, travels in the rain, or is washed. When water is induced into the blower 10, the blower 10 cannot be operated due to a damage of the motor 50. Therefore, a technology to discharge water induced into the blower 10 to the condensed water outlet 64 has been disclosed.\nThat is, as shown in FIG. 1, when a drain pipe 90 is connected from the bottom of the intake duct 20 to the upper portion of the insulator 70 mounted on the air conditioning case 60, water induced into the intake duct 20 through the outside air intake 26 is induced to the upper portion of the insulator 70, so that the water can be drained to the outside through the drain hole 64 of the insulator 70 and the condensed water outlet 64.\nHowever, the conventional two layer type air conditioner has several problems in that water and condensed water may flow backward to the blower 10 since static pressure directing to the evaporator 80 is higher than that directing to the bottom of the intake duct 20, and in that the flow channel for naturally draining water to the outside may be stopped due to its small area. Additionally, the blower 10 may be sealed in order to prevent backflow of water or condensed water to the blower 10. 
At this time, to improve sealing efficiency, a sealing rib (not shown) protrudes to the side surface of the air conditioning case 60, and a number of screws are used for enhancing its assembling performance, thereby the manufacturing costs of the conventional two layer type air conditioner is increased. Furthermore, in case where water induced into the blower 10 is drained to the drain hole, the number of components of the air conditioner is increased,"} +{"output_text": " be seen from FIG. 1, however, the contact pin 10 is not provided with any elastic member. Therefore, the wiping action is not sufficient. Second, the contact pin 10 is provided with a contact portion 14 having a length of about 0.5 mm. Therefore, the length of the electrical connection between the contact portion 14 and the PCB terminal 13 is long. As a result, the inductance of the contact pin 10 becomes larger. Third, the contact pin 10 is provided with a contact", "input_text": " type. For this test, the socket is mounted on a printed circuit board (PCB) connected to measurement equipments. The IC circuit package is mounted on the socket. At this time, an electrical connection element, such as a contact pin is required for connecting the leads of the IC package to the terminals of the PCB.\nIn order to design a contact pin performing this function, the following four requirements should be considered. First, it should maintain good contact be with the tested subject. For this, the contact portion of the contact pin should maintain contact with the IC package while performing a wiping action. Second, it is required that the length of the electrical connection between contact points should be short as possible. In particular, when the contact pin is used to perform a high-frequency IC test, a long length of electrical connection may act as an impedance since the inductance of the contact pin becomes larger. Third, the contact pin should not cause any bending of an IC lead during the test. 
Bending of an IC lead during the test substantially degrades the yield of the IC. Fourth, the contact pin and the elastic members supporting them should have a long life. If any of the contact pins fails, the entire test socket should be replaced.\nVarious structures of contact pins in consideration of these requirements have been developed and used. FIGS. 1xcx9c3 are cross-sectional views of conventional contact pin structures mounted on a conventional socket according to their respective use.\nThe contact pin shown in FIG. 1 is so called Yamaichi socket pin, which is widely used in integrated circuit tester sockets. The Yamaichi socket pin, however, has the following disadvantages in view of the above four requirements. First, when a lead 12 of an IC 11 is pressed against the contact pin 10, some degree of wiping action is caused at a potion contacting a PCB terminal 13. As can"} +{"output_text": "phenylalanine 1-methyl ester is used as the active ingredient.\nThe present invention relates to a method for the preparation of N-[N-[5-[4-(aminoiminomethyl)-phenyl]-1-oxopentyl]-L-xcex1-aspartyl]-L-phenylalanine 1-methyl ester, which is a useful intermediate for the preparation of a compound having an angiotensin converting enzyme (hereinafter referred to as ACE) inhibitory activity, and a process for the preparation of the", "input_text": " linear or branched saturated hydrocarbon group having 1 to 6 carbon atoms. 
Illustrative of such groups are methyl, ethyl, propyl, isopropyl and butyl.\nThe term xe2x80x9ccompositionxe2x80x9d as used herein means a product which results from the mixing or combining of more than one element or ingredient.\nThe term xe2x80x9cpharmaceutically-acceptable carrierxe2x80x9d as used herein means a pharmaceutically acceptable material, composition or vehicle, such as a liquid or solid filler, diluent, excipient, or solvent involved in carrying or transporting a chemical agent from one organ or portion of the body to another organ or portion of the body.\nThe term xe2x80x9ctransdermal deliveryxe2x80x9d as used herein means administration of the pharmaceutical composition topically to the skin wherein the active ingredient, or its pharmaceutically acceptable salts, will be percutaneously delivered in a therapeutically effective amount.\nThe term xe2x80x9cpenetration enhancersxe2x80x9d as used herein means compounds which enhance the percutaneous absorption of drugs. Selection of an effective penetration enhancer for a particular drug must be experimentally deduced. A penetration enhancer which works for one drug will not necessarily work for every other drug B. 
Idson, Cosmetics and Toiletries, 95, 59 (1980).\nThe term xe2x80x9cdelivery solventxe2x80x9d as used herein means a solvent in which the dissolution (dissolving of the active or compound into the solvent) is complete.\nIn the present invention N-[N-[5-[4-(aminoiminomethyl)-phenyl]-1-oxopentyl]-L-xcex1-aspartyl]-L-"} +{"output_text": " device determines the type of the recording disk by detecting the disk information from the recording disk.\nIn the above-stated information reproducing apparatus, the second determination device may includes: a detection device for detecting the disk information from the recording disk; and a disk determination device for determining on the basis of a detection of the detecting device whether the recording disk is the recording disk of the recordable type or the recording disk of the read-only type.\nThe recording disk has disk information. The disk information", "input_text": " on the basis of a detection of the detecting device whether the recording disk is the recording disk of the recordable type or the recording disk of the read-only type.\nIn the recordable type recording disk, the track has the wobble. In the read-only type recording disk, the track does not have the wobble. The second determination device determines the type of recording disk by detecting presence or absence of the wobble.\nIn the above-stated information reproducing apparatus, the second determination device may includes: a detection device for detecting a loop track formed on the surface of the recording disk; and a disk determination device for determining on the basis of a detection of the detecting device whether the recording disk is the disk of the recordable type or the disk of the read-only type.\nEach of the recording disk of the recordable type and the recording disk of the read-only type has a spiral track. The spiral track is formed on a surface of the recording disk. The record information is recorded on the spiral track. 
Only the recording disk of the recordable type further has a loop track. The loop track is formed on the surface of the recording disk together with the spiral track. The loop track is different from the spiral track in form. The second determination device determines the type of recording disk by detecting presence or absence of the loop track.\nIn the above-stated information reproducing apparatus, the second determination device may includes: a detection device for detecting the disk information from the recording disk; and a disk determination device for determining on the basis of a detection of the detecting device whether the recording disk is the recording disk of the recordable type or the recording disk of the read-only type.\nThe recording disk has disk information. The disk information is recorded on the recording disk as digital data. The disk information indicates the type of the recording disk. The second determination"} +{"output_text": " objects. The wavefronts are approximated by a plurality of two-dimensional images that are captured from different angles. The images are then combined to form a three-dimensional image. The three-dimensional image is created by combining the wavefronts that emanate from the three-dimensional objects.\nThe wavefronts that emanate from the three-dimensional objects are approximated by a plurality of two-dimensional images that are captured from different angles. The images are then combined to form a three-", "input_text": " to the stop position of the wiper blade, by for example, providing the raindrop sensor at the lower part of the windshield, it may happen that the wiper blade has already passed the sensing range before the outputting of the operating signal upon the shifting of the operational state of the wiper motor from the stopped state to the operating state. 
In such a case, it is not possible to set the time period, during which the wiper blade passes the sensing range of the raindrop sensor, as the raindrop quantity sensing prohibited time period. As a result, it is not possible to eliminate the above-described influences on the result of the determination of the quantity of raindrops at the time of passing of the wiper blade through the sensing range of the raindrop sensor. Thereby, it is difficult to accurately determine the quantity of raindrops in the sensing range of the raindrop sensor.\nIn order to avoid the above disadvantages caused by the time difference, it is conceivable to set the sensing range of the raindrop sensor remote from the stop position of the wiper blade. However, in such a case where the sensing range of the raindrop sensor is set remote from the stop position of the wiper blade, the sensing range of the raindrop sensor is normally placed in a vertical center part of the windshield. When the raindrop sensor is placed in such a location, the sight of the user of the vehicle is disadvantageously reduced or interfered. Furthermore, the positioning of the raindrop sensor in such a location is prohibited by the law in many countries. This disclosure relates generally to electronic display technology and more specifically to multi-view three-dimensional parallax displays.\nIt is known that it is possible to create a three-dimensional image by approximating the wavefronts that emanate from three-dimensional"} +{"output_text": " method and apparatus for providing a web-based service to a client system. The web-based service includes a plurality of web pages that are stored on a server system. The web pages are accessible to the client system via a network. The web pages are generated by a web server system. The web pages are generated by a web server system that is remote from the client system. The web pages are generated by a web server system that is remote from the client system. 
The web pages are generated by", "input_text": " products, and directories of on-line services, among others. The service may make the information available free of charge, or for a fee, and may be on publicly accessible or private computer systems.\nThe user of an on-line service uses a program on the client system to access the information managed by the on-line service. Possible user capabilities include viewing, searching, downloading, printing, and filing the information managed by the server.\nU.S. Pat. No. 5,870,552 to Dozier, et al. (\u201cDozier\u201d), teaches a development platform technology for publishing hypermedia documents across wide area networks (WAN). Generally, Dozier addresses the problem of editing hypermedia documents on WAN servers for WAN publishing. Dozier notes that it is not generally possible to \u201copen\u201d multiple WAN documents for editing and to transfer text, images, and URL's among those documents in the seamless fashion as is presently done with typical word processors for local computer documents. Also, Dozier notes that current web authoring tools generally do not provide full WYSIWYG (\u201cWhat You See Is What You Get\u201d) feedback as to HTML markups and hypermedia links. However, Dozier does not teach a method to effectuate sharing of a device/software independent image in a mainframe printing environment.\nU.S. Pat. No. 5,793,966 to Amstein et al. (\u201cAmstein\u201d) also teaches a client/server system, using a web server that allows for the creation and maintenance of an on-line service using a client system that remotely causes the server to perform operations required in the authoring process. However, Amstein does not address sharing a device/software independent image in a mainframe printing environment.\nU.S. Pat. No. 5,987,480 to Donohue et al. 
teaches a"} +{"output_text": " symbol duration of 1.25 microseconds, the maximum data rate is 11.2 Mbits/s.\nThe IEEE 802.11b standard specifies a maximum data rate of 11.2 Mbits/s. The maximum data rate is achieved by using a 64 chip spreading code and a symbol duration of 1.25 microseconds. The maximum data rate is achieved by using a 64 chip spreading code and a symbol duration of 1.25 microseconds. The maximum data rate is achieved by using", "input_text": "lapping DSSS channels in the ISM frequency band.\nThe preamble of the IEEE 802.11 data packet is used by the receiver to initiate spreading code synchronization is always transmitted as the DBPSK wave form. This permits all receivers to identify the transmitted waveform and, if the receiver is capable, switch to a higher rate mode of operation for interaction a particular WLAN device. The header of an IEEE 802.11 data packet which includes a cyclic redundancy check code, a packet payload transmission rate indicator, and payload length signal may be transmitted as either a DBPSK or DQPSK waveform.\nTo achieve higher data rates, the IEEE 802.11b revision adopts Complementary Code Keying (CCK) to replace the 11-chip Barker sequence for modulating data packet payloads. Complementary codes or binary, complementary sequences are polyphase codes comprising a pair of equal finite length sequences having the property that the number of pairs of like elements with any given separation in one series is equal to the number of pairs of unlike elements with the same separation in the second series. As a set, these code sequences have unique mathematical properties that facilitate distinguishing between code words at the receiver even in the presence of substantial noise and multipath interference. For an 11 Mbits/s data rate the information data stream is divided into eight bit segments. The values of six of the data bits are used to generate one of 64 unique subcodes. 
The values of the two remaining data bits are used to select one of the DQPSK phases for rotating the selected subcode producing 256 possible codewords for transmission. Systems operating in the 5.5 Mbits/s mode use two data bits to generate one of four subcodes and two bits are used to select one of the four DQPSK phases. With a symbol rate of 1.375 Msymbols/s, an eight chip spreading code, and a"} +{"output_text": "ink to adjust the transmission parameters of the uplink signal. The IR process is then repeated until the BS receives uplink signals from all remote users that are attempting to establish a link with the BS.\nUnfortunately, the IR process is not always successful. For example, the IR process may fail if the remote user is not able to receive the downlink control channel. In addition, the IR process may fail if the remote user is not able to synchronize its uplink signal with the downlink", "input_text": " initial ranging in wireless communication systems.\n2. Related Art\nIn the field of wireless communication technology, recent demands for high data rates have stimulated intense research activity in multicarrier modulation techniques. Particular interest has been directed to orthogonal frequency-division multiple-access (OFDMA) wireless communication systems, which have become part of the Institute of Electrical and Electronics (IEEE) 802.16 family of standards for wireless metropolitan area networks (WMANs). OFDMA systems allow for the transmission of digital information between a base station (BS) and a plurality of remote user devices, such as cellular telephones, wireless handheld computers, etc. OFDMA systems split the available bandwidth into smaller subchannels composed by a set of orthogonal subcarriers, which are assigned to different users to simultaneously communicate with the BS. 
The user signals are received at the BS as a series of orthogonal frequency division multiplexing (OFDM) blocks, from which information data are extracted by the BS. Unfortunately, OFDMA systems are extremely sensitive to timing errors and carrier frequency offsets (CFOs) that may occur between uplink signals and the local references at the BS. Timing errors give rise to interblock interference (IBI), while inaccurate compensation of the CFOs destroys orthogonality among subcarriers and produces interchannel interference (ICI) as well as multiple access interference (MAI).\nIn an attempt to alleviate these drawbacks of OFDMA systems, the IEEE 802.16 standards specify a synchronization procedure called Initial Ranging (IR), wherein users that intend to establish a link with the BS adjust their transmission parameters so that uplink signals arrive at the BS synchronously and with approximately the same power level. In its basic form, the IR process develops through the following steps. First, a remote user computes frequency and timing estimates on the basis of a downlink control channel. The estimated parameters are then used in the upl"} +{"output_text": " member and a rotating member. The rotating member is mounted for rotation about an axis of rotation. The stationary member is fixed to a structure that is not rotating. The rotating member is mounted for rotation about the axis of rotation. The rotating member is mounted for rotation about the axis of rotation by bearings. The bearings are located between the rotating member and the stationary member. The bearings are lubricated by a lubricant. The lubricant is contained within a lubricant reservoir. The lubricant reservoir is located", "input_text": " not necessitating a prior knowledge of required stud position, they are all very time and labor intensive since approximately three studs per square foot of surface area are required to hold the insulation in place. 1. 
Field of the Invention\nThe invention relates to a dynamoelectric, rotating machine; and more particularly, to an axial airgap, dynamoelectric, rotating machine comprising a rotor assembly and a stator assembly that includes a frontiron section, a backiron section, and a plurality of stator tooth sections.\n2. Description of the Prior Art\nThe electric motor and generator industry is continuously searching for ways to provide dynamoelectric, rotating machines with increased efficiencies and power densities. As used herein, the term \u201cmotor\u201d refers to all classes of motoring and generating machines which convert electrical energy to rotational motion and vice versa. Such machines include devices that may alternatively function as motors, generators, and regenerative motors. The term \u201cregenerative motor\u201d is used herein to refer to a device that may be operated as either an electric motor or a generator. A wide variety of motors are known, including permanent magnet, wound field, induction, variable reluctance, switched reluctance, and brush and brushless types. They may be energized directly from a source of direct or alternating current provided by the electric utility grid, batteries, or other alternative source. Alternatively, they may be supplied by current having the requisite waveform that is synthesized using electronic drive circuitry. Rotational energy derived from any mechanical source may drive a generator. The generator's output may be connected directly to a load or conditioned using power electronic circuitry. Optionally, a given machine is connected to a mechanical source that functions as either a source or sink of mechanical energy during different periods in its operation. The machine thus can act as a regenerative motor, e.g. by connection through power conditioning circuitry capable of four-quadrant operation.\nRotating machines ordinarily include a stationary"} +{"output_text": " deposited under ATCC Accession No. PTA-4328. 
The inbred corn seed of the invention may be provided as an essentially homogeneous population of inbred corn seed of the variety designated I181664. Essentially homogeneous populations of inbred seed are those that consist essentially of the particular inbred seed, and are generally free from substantial numbers of other seed, so that the inbred seed forms between about 90% and about 100% of the total seed, and preferably, between about 95%", "input_text": "breeding process in corn, the vigor of the plants decreases. Vigor is restored when two unrelated inbred plants are crossed to produce the hybrid progeny (F1). An important consequence of the homozygosity and homogeneity of the inbred plants is that the hybrid between any two inbreds is always the same. Once the inbreds that give a superior hybrid have been identified, hybrid seed can be reproduced indefinitely as long as the homogeneity of the inbred parents is maintained. Conversely, much of the hybrid vigor exhibited by F1 hybrids is lost in the next generation (F2). Consequently, seed from hybrid varieties is not used for planting stock. It is not generally beneficial for farmers to save seed of F1 hybrids. Rather, farmers purchase F1 hybrid seed for planting every year.\nNorth American farmers plant tens of millions of acres of corn at the present time and there are extensive national and international commercial corn breeding programs. A continuing goal of these corn breeding programs is to develop corn hybrids that are based on stable inbred plants and have one or more desirable characteristics. To accomplish this goal, the corn breeder must select and develop superior inbred parental plants.\nIn one aspect, the present invention provides a corn plant of the variety designated I181664. Also provided are corn plants having all the physiological and morphological characteristics of the inbred corn variety I181664. 
The inbred corn plant of the invention may further comprise, or have, a cytoplasmic or nuclear factor that is capable of conferring male sterility or otherwise preventing self-pollination, such as by self-incompatibility. Parts of the corn plant of the present invention are also provided, for example, pollen obtained from an inbred plant and an ovule of the inbred plant.\nThe invention also concerns seed of the corn plant I181664. A sample of this seed has been"} +{"output_text": "; Brenman et al., 2000; Brenman et al., 2001; Brenman et al., 2002; Brenman et al., 2003; Brenman et al., 2004; Brenman et al., 2005; Brenman et al., 2006). The PDZ domain is a common protein-protein interaction domain found in a variety of proteins including the dystrophin protein, \u03b1-syntrophin, and the neuronal nitric oxide synthase (nNOS) (Brenman et al., 1996", "input_text": " microgene because it can recover the muscle specific force to the same level as the full-length dystrophin gene (Harper et al., 2002; Lai et al., 2005). However, the minigene cannot restore nNOS. As a matter of fact, none of the existing mini- or micro-dystrophin genes have the ability to recruit nNOS to the sarcolemma (Table 1) (Chao et al., 1996; Crawford et al., 2000; Warner et al., 2002; Wells et al., 2003; Torelli et al., 2004; Lai et al., 2005; Yue et al., 2006; Li et al., 2006; Judge et al., 2006). The failure to restore sarcolemmal nNOS will significantly reduce the therapeutic efficacy of the minimized dystrophin genes.\nPreviously it was thought that nNOS is recruited to the sarcolemma through the C-terminal domain of the dystrophin protein (Brenman et al., 1995; Brenman et al., 1996). The full-length dystrophin protein has four domains including the N-terminal domain, mid-rod domain, cysteine-rich domain, and C-terminal domain. The N-terminal domain and a portion of the mid-rod domain interact with cytoskeleton protein F-actin. The mid-rod domain contains 24 spectrin-like repeats and four hinges. 
The cysteine-rich domain interacts with transmembrane protein dystroglycan to connect dystrophin to the extracellular matrix. The C-terminal domain contains two syntrophin binding sites and one dystrobrevin binding site. Several studies suggest that nNOS is recruited to the sarcolemma through a PDZ/PDZ domain interaction between nNOS and \u03b1-syntrophin (Brenman et al., 1996; Hillier et al., 1999; Kameya et al., 1999"} +{"output_text": ".\nA further object of the invention is to provide a voice activity detector that is not sensitive to noise.\nA further object of the invention is to provide a voice activity detector that is not sensitive to the presence of a voice signal in the presence of noise.\nA further object of the invention is to provide a voice activity detector that is not sensitive to the presence of a voice signal in the presence of a voice signal.\nA further object of the invention is to provide a voice activity detector", "input_text": " takes too long. The Eryilmaz patent attempts to simplify the amount of computation but still requires manipulation of significant amounts of data. All these systems manipulate amplitude data, or data derived from amplitude, up to the point of making a binary value signal indicating voice.\nOne can increase the speed of a system by reducing the amount of data being processed. Unfortunately, this typically reduces the resolution of the system. For example, all other parameters being equal, eight bit data is more quickly processed than sixteen bit data. The problem is that resolution is reduced. In an acoustic environment, the quality or fidelity of the audio signal requires a minimum amount of data. Thus, the problem remains of speeding up a system other than by simply increasing the clock frequency.\nSome of the prior art systems use historical data, e.g. three occurrences of what is interpreted as a voice signal. 
Such systems require large amounts of memory to handle the historical data and the current data.\nVoice detection is not just used to determine transmit or receive. A reliable voice detection circuit is necessary in order to properly control echo cancelling circuitry, which, if activated at the wrong time, can severely distort a desired voice signal. In the prior art, this problem has not been solved satisfactorily.\nIn view of the foregoing, it is therefore an object of the invention to provide an improved method for analyzing the energy content of an incoming signal.\nAnother object of the invention is to provide a simple but effective circuit for detecting voice.\nA further object of the invention is to provide a circuit having dynamically adjustable thresholds for analyzing energy content of a speech signal.\nAnother object of the invention is to provide a voice activity detector that does not require large amounts of data for reliable detection of a voice signal.\nA further object of the invention is to provide an apparatus and a method for analyzing the envelope of a signal with minimal computation"} +{"output_text": " another aspect, the invention is an acoustic detection method comprising the steps of transmitting an acoustic signal employing a triplet-pair comb waveform to ensonify a target area, detecting acoustic reflections from the target area at a receiver transducer, generating a transducer output signal representing the acoustic reflections, and processing the transducer output signal to determine range and Doppler values for the target area.\nIn yet another aspect, the invention is an acoustic detection method comprising the steps of transmitting an acoustic signal employing a triplet-pair comb waveform", "input_text": " water at longer range. 
Because active sonar transmitters suitable for littoral operation are normally power- and duty-cycle-limited, there is a need for transmit waveforms with dynamic range limited to make use of as much available power as possible. Collins et al. suggest that the SFM waveform is preferred over the Cox comb waveform despite the resulting range-ambiguity problems because of the improved noise-limited performance of the higher average transmitter power available from SFM.\nThere is accordingly still a clearly-felt need in the art for an active sonar system that provides improved detection performance in either reverberation-limited or noise-limited littoral regions. These unresolved problems and deficiencies are clearly felt in the art and are solved by this invention in the manner described below.\nThis invention solves the active sonar comb-waveform power-limitation problem by introducing for the first time a system employing a new comb waveform herein denominated the triplet-pair comb waveform. Ambient noise-limited performance of the system of this invention is superior to that of systems employing other Doppler-sensitive waveforms such as the geometric comb waveform. 
Reverberation-limited performance of the system of this invention is slightly inferior to that of systems employing other Doppler-sensitive waveforms but this invention eliminates much of the range ambiguity problems seen with other non-comb waveforms.\nIt is a purpose of this invention to provide an active sonar system with improved noise-limited performance in littoral regions with reverberation.\nIn one aspect, the invention is an acoustic detection method comprising the steps of transmitting an acoustic signal employing a triplet-pair comb waveform to ensonify a target area, detecting acoustic reflections from the target area at a receiver transducer, generating a transducer output signal representing the acoustic reflections, and processing the transducer output signal to determine range and Doppler values for the target area.\nIn"} +{"output_text": " It is a mixture of Cinnamomi and Poria, which are the most commonly used herbs in TCM. The formula of BCP is: Cinnamomi (0.5-1.0 g), Poria (0.5-1.0 g), Radix Paeoniae Alba (0.5-1.0 g), Semen Persicae (0.5-1.0 g), and Radix Glycyrrhizae (", "input_text": " non-steroidal anti-flammatory drugs (NSAIDs) are given, since NSAIDs are capable of inhibiting cyclo-oxygenase and synthesis of prostaglandins.\nIn addition to NSAIDs, oral contraceptives, antispasmodics, and analgesics are commonly used by physicians. Androgen therapy is sometimes used, and a minor surgery called dilation and curettage (D&C) is also adopted in certain cases.\nTraditional treatments are associated with a variety of side effects, including ineffectiveness and drug tolerance, and have limited effects. That explains why primary dysmenorrhea is still a problem. The present invention was developed using the most common herbal formulations in Traditional Chinese Medicine (TCM). 
This invention uses a different approach characterized by steady effectiveness and relatively low toxicity.\nThe Cinnamomi and Poria composition is synthesized to protect women from the pain caused by most common pelvic diseases or disorders. In addition to treating primary and secondary dysmenorrhea and dysfunctional uterine bleeding caused by irregular shedding of uterine endometrium, the composition is also effective in treating chronic pelvic inflammations, inflammatory lower abdominal masses and small intramural hysteromyoma.\nLike most TCM products, the Cinnamomi and Poria composition is prepared from multiple medicinal herbal materials\u2014five cultivated natural plants called Ramulus Cinnamomi, Poria, Cortex Moutan, Radix Paeoniae Alba and Semen Persicae. Different in weights and portions of the medicinal herbs for the start-up materials and in the producing courses, Cinnamomi and Poria composition shares the same formula of five herbs with a previous medicinal preparation\u2014Bolus of Cinnamomi and Poria (BCP).\nBCP, which has long been approved as an effective cure, was a sample of success in medicinal practice in ancient China."} +{"output_text": "hole pressure. If the density is too high, the column weight will be too great, and the column will be unable to support the weight of the drill string. If the density is too low, the column weight will be too small, and the column will be unable to support the weight of the drill string. If the density is too high, the column weight will be too great, and the column will be unable to support the weight of the drill string. If the density is too low,", "input_text": "\u201d. One of the important aspects of the data collected during such a test is the pressure build-up information gathered after drawing the pressure down. From this data, information can be derived as to permeability, and size of the reservoir. 
Further, actual samples of the reservoir fluid are obtained, and tested to gather Pressure-Volume-Temperature data relevant to the reservoir's hydrocarbon distribution.\nIn order to perform these important tests, it is currently necessary to retrieve the drill string from the well borehole. Thereafter, a different tool, designed for the testing, is run into the well borehole. A wireline is often used to lower a test tool into the well borehole. The test tool sometimes utilizes packers for isolating the reservoir. Numerous communication devices have been designed which provide for manipulation of the test tool, or alternatively, provide for data transmission from the test tool. Some of those designs include signaling from the surface of the Earth with pressure pulses, through the fluid in the well borehole, to or from a downhole microprocessor located within, or associated with the test tool. Alternatively, a wire line can be lowered from the surface, into a landing receptacle located within a test tool, establishing electrical signal communication between the surface and the test assembly. Regardless of the type of test tool and type of communication system used, the amount of time and money required for retrieving the drill string and running a second test tool into the borehole is significant. Further, if the borehole is highly deviated, a wire line tool is difficult to use to perform the testing.\nThere is also another type of problem, related to downhole pressure conditions, which can occur during drilling. The density of the drilling fluid is calculated to achieve maximum drilling efficiency while maintaining safety, and the density is dependent upon the desired relationship between the weight of the drilling mud column and the down"} +{"output_text": " after a preliminary purification, is concentrated by evaporation and the lysine is crystallized from the concentrated solution. 
The lysine is then isolated by filtration and the filtrate is concentrated by evaporation. The lysine is then crystallized from the concentrated solution. The lysine is then isolated by filtration and the filtrate is concentrated by evaporation. The lysine is then crystallized from the concentrated solution. The lysine is then isolated by filtration and the filtrate is concentrated by evaporation. The lysine is then crystallized from the concentrated solution. The lysine is then isolated by filtration and the filtrate is concentrated by evaporation. The lysine is then crystallized from", "input_text": " the fermentation broth separated from the biomass is acidified, preferably by the addition of hydrochloric acid (HCl) or sulfuric acid (H2SO4), to ease adsorption of the lysine on the ion-exchange resins. In addition to the L-lysine produced by fermentation, various other cations which are present in the fermentation broth are also bound. In general, various ion-exchange columns connected in sequence are necessary for obtaining a pure product. The adsorbed lysine is then preferably eluted by an ammoniacal solution and the ion-exchange column is regenerated. The lysine solution obtained in this way is then concentrated and lysine-HCl is obtained in crystalline form after neutralization with hydrochloric acid.\nAnother method enables lysine to be obtained in the form of a crystalline salt after purifying with activated carbon (SU-183581). The lysine-containing fermentation broth is inactivated by standard processes using moist heat and separated off from the biomass by filtration. After acidification of the filtrate to pH 5, 4-5% activated carbon is added with constant stirring at 50-55\u00b0 C., in order to separate off undesirable impurities from the filtrate and to prevent discoloration of the crystallizate. In a further filtration stage thereafter, the activated carbon is separated off and the dissolved sulfate is then precipitated as calcium sulfate by the addition of calcium hydroxide. 
This is filtered off, the ammonia content being removed in a rotary evaporator under vacuum and the solution being concentrated until crystallization occurs on cooling.\nThe disadvantage of these two preparation methods lies in the numerous individual stages and the complex cleaning processes using ion-exchange chromatography. The elimination of troublesome salts or the use of different elution media creates additional waste streams, which have either to be cleaned by complex methods or expensively disposed of.\nEP-B-0533039 counters these disadvantages in that all the fermentation feedstock, optionally"} +{"output_text": " side layer will be concentrated at one end of the float.\nThe machine side layer internal float is also limited in length. The machine side layer internal float is the weft binder yarn which is used to bind the machine side layer to the paper side layer. The machine side layer internal float is the weft binder yarn which is used to bind the machine side layer to the paper side layer. The machine side layer internal float is the weft binder yarn which is used to bind the machine side layer", "input_text": " two layers. There are three parameters which determine the fabric weave pattern. First, the paper side layer weft binder yarn internal float should be as long as possible. Second, the path of the weft binder yarn internal float should be as symmetrical as possible about the interlacing point with the machine side layer internal warp yarn float. Third, in order to protect the weft binder yarn from abrasion, the interlacing point should be as close as possible to the middle of the machine side layer internal warp yarn float.\nA second concept used in this invention is that all of the paper side layer weft yarns are substantially the same size. 
Although some are doubled as weft binder yarn pairs, only one pair member at a time occupies each segment in the unbroken weft path and therefore all of the weft binder yarns contribute to the properties of the paper side layer of the fabric.\nWithin these broad constraints, it is possible to create a forming fabric in which the weft yarns chosen as weft binder yarns are irregularly spaced.\nIt is thus apparent that the interlacing locations of the paper side layer and machine side layer internal floats in the fabrics of this invention should be chosen with some care. The limitation on both of these floats appears to be that each should be as long as is reasonably possible within the constraints of the two weave designs. For example, in its path in between the two layers, the paper side float has essentially a \u201cV\u201d shape: as the float length increases, the V is flattened reducing the out of plane stresses imposed on the paper side layer. In a similar way, if the V shaped path is not symmetrical, and the interlacing point is close to one end of the float, or the float is relatively short, any stresses imposed on the paper"} +{"output_text": "olved solids separation step.\nIt is also known to remove sulfur from coal by reacting the coal with a hydrogen-containing gas in the presence of a catalyst. For example, U.S. Pat. No. 3,923,926 discloses a process for the removal of sulfur from coal by reacting the coal with hydrogen in the presence of a catalyst. The catalyst is a metal or metal compound of the platinum group of the periodic table. The patent discloses that the catalyst may be a metal", "input_text": " coal is burned to produce energy.\nA variety of methods have been suggested to reduce the discharge of such sulfur compounds into the atmosphere when sulfur-containing fuels such as coal are burned. Two general methods have been tried. 
One method involves removing sulfur from stack gases after the sulfur-containing fuel is burned, whereas the other method removes sulfur from the fuel before it is burned. While numerous methods have been tried for stack gas cleaning, none appear to be simple or low cost. The inherent difficulties of such an approach are the enormous volumes of stack gas that must be processed and the low concentration of sulfur in these gases.\nIt is desirable to reduce the sulfur content of the coal initially. If this is successful, then the fuel can be burned as it has been in the past -- i.e. without material change in the operation of furnaces, boilers, and utility plants. Moreover, sulfur removal may be accomplished at one location without the need to provide extensive sulfur removal facilities at each location. Accordingly, it is very desirable to be able to substantially reduce the sulfur content of a coal before it is burned as a fuel or otherwise gasified and/or liquified for further processing into specific fuels.\nThe solvent-refined coal or coal extraction process reduces the sulfur content of coal by first dissolving the coal in a suitable solvent to produce a mixture of liquid and undissolved solids from which the solid may be removed by filtration or other conventional solids -- liquid separation processes. The dissolution step is often carried out under hydrogen pressure. Solvent is recovered from the filtrate by means such as vacuum distillation. The distillation residue can be handled in either solid or liquid form and is a low ash, low sulfur material known as solvent refined coal or coal extract. However, such a process has relatively expensive and complex equipment requirements. Furthermore, considerable difficulty has been encountered in the solvent-extract/undiss"} +{"output_text": " standardized.\nA typical Flash memory comprises a memory array that includes a large number of memory cells arranged in row and column fashion. 
Each of the memory cells includes a floating gate field-effect transistor capable of holding a charge. The cells are usually grouped into blocks of 64 cells each, which are then coupled together into a complete row of cells. Each of the cells within a block can be electrically programmed in a random basis by charging the floating gate. The charge can be removed from the floating gate by", "input_text": " the image to reduce LER in the transferred patterns are also needed. Memory devices are typically provided as internal storage areas in the computer. The term memory identifies data storage that comes in the form of integrated circuit chips. There are several different types of memory. One type is RAM (random-access memory). This is typically used as main memory in a computer environment. RAM is read and write memory; that is, you can both write data into RAM and read data from RAM. This is in contrast to ROM, which permits you only to read data. Most RAM is volatile, which means that it requires a steady flow of electricity to maintain its contents. As soon as the power is turned off, whatever data was in RAM is lost.\nComputers almost always contain a small amount of read-only memory (ROM) that holds instructions for starting up the computer. Unlike RAM, ROM cannot be written to. An EEPROM (electrically erasable programmable read-only memory) is a special type non-volatile ROM that can be erased by exposing it to an electrical charge. Like other types of ROM, EEPROM is traditionally not as fast as RAM. EEPROM comprise a large number of memory cells having electrically isolated gates (floating gates). Data is stored in the memory cells in the form of charge on the floating gates. Charge is transported to or removed from the floating gates by programming and erase operations, respectively.\nYet another type of non-volatile memory is a Flash memory. A Flash memory is a type of EEPROM that can be erased and reprogrammed in blocks instead of one byte at a time. 
Many modern PCs have their BIOS stored on a flash memory chip so that it can easily be updated if necessary. Such a BIOS is sometimes called a flash BIOS. Flash memory is also popular in modems because it enables the modem manufacturer to support new protocols as they become"} +{"output_text": " in a large hall will be reproduced as if the listener were in the same room as the recording. Sound recorded in a small club will be reproduced as if the listener were in the same room as the recording.\nThe present invention is directed to a method and apparatus for providing a high-speed, high-bandwidth, low-latency, low-power-consumption, high-reliability, high-speed, high-bandwidth, low-latency, low-power-", "input_text": "roying effect using the laser energy disclosed herein, there is less risk of damage to surrounding soft and hard tissue, less pain during and after surgery, and healing of the gingival tissue occurs faster than it otherwise would. In addition, the controlled damage that does occur to surrounding tissue is beneficial because the use of the laser on the infected soft tissue creates a new adjacent soft tissue surface that can readhere to the tooth, thereby closing the periodontal pocket. Furthermore, the preferred photosensitizing formula is not carcinogenic or otherwise harmful to the patient.\nThese and other features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter. In the average movie theater, two types of \"surround\" systems are used--the 70 mm 6-track magnetic system, and the more common 35mm optical arrangement. The former uses a magnetic strip attached to the film to supply six discrete channels, and the latter uses two optical audio tracks. 
This two-channel system is the basis for home surround sound decoders.\nEvery stereo videodisc, tape and MTS broadcast that was surround encoded still contains the same rear channel information as the two-channel magnetic master from which the theatrical 35mm optical soundtrack was produced. In other words, your stereo videotape or disc of Star Trek I, II, III, Raiders of the Lost Ark, Superman and Star Wars can be decoded to produce surround sound at home. In addition, LPs, CDs and any stereo audio material can benefit from surround sound decoding. Ambiance extraction is a pleasant side effect that many decoders provide. In a nutshell, if the recording was made in a large hall, or a small club, \"surround sound\" will reproduce the recording environment faithfully.\nAssuming the listener is seated centered between the two speakers, sound which is recorded"} +{"output_text": " 3, 1992, there is described a method of forming a copper interconnect structure. The method includes the steps of depositing a barrier layer over a substrate, depositing a copper layer over the barrier layer, and depositing a layer of palladium silicide over the copper layer. The palladium silicide layer is formed by heating the copper layer to a temperature of about 400.degree. C. to about 600.degree. C. The palladium silicide layer is then selectively etched to form", "input_text": " metallization maintained at a depth of less than two microns. An infrared laser then moves around the surface and selectively vaporizes the metallization, leaving a desired printed circuit pattern. The remaining metallization is plated to a useable depth. Using a second technique, a fiber optic bundle is machined on one end to mate with the three dimensional surface. The three dimensional surface, metallized and coated with photo-resist, resides in intimate contact with this first end. A second end of the cable is flat and resides in intimate contact with two-dimensional master photo artwork. 
A pattern is exposed on the photo-resist through the fiber optic bundle, and the metallization is etched using conventional techniques.\nIn U.S. Pat. No. 5,308,796, issued May 3, 1994, there is described a deposition process which involves formation of a silicide, such as palladium silicide, in the region upon which copper deposition is desired. The silicide acts as a catalyst to initiate reduction of copper ions from an electro-less plating bath to produce an acceptably low resistance copper deposition. Thus, for example, in the case of producing an interconnect involving a silicon region at the bottom of the interconnect structure defined through a silicon dioxide region, palladium is first evaporated over the entire surface and is heated to form palladium silicide only at the base of the via structure. The palladium on the silicon dioxide surface is un-reacted. A selective etch is then used to remove the un-reacted surface palladium. Upon substrate immersion in a conventional electro-less copper plating bath, copper deposition proceeds selectively on the palladium silicide surfaces and continues up through the interconnect. The silicon dioxide surface is non-catalytic to the plating step and induces essentially no copper deposition.\nIn U.S. Pat. No. 5,160,579, issued Nov."} +{"output_text": ", the cost of the key was increased.\nAnother solution to the problem of providing a single key that could be used to select a plurality of letters was to provide a key that was structured to overlay two conductors. That is, the key was structured to overlay a primary conductor and a secondary conductor. The secondary conductor was disposed adjacent to the primary conductor and was disposed on the opposite side of the key from the primary conductor. The secondary conductor was disposed adjacent to the primary conductor and was disposed on the", "input_text": " having to depress an extra key.\nOne means of addressing this disadvantage was provided by software. 
Disambiguation routines were created that suggested one of the letters based on, for example, a subsequent keystroke. That is, if the user had selected the letters \u201cQU\u201d and the next key depressed was the \u201cA/S\u201d key, the software would suggest the use of the letter \u201cA\u201d because the letter combination \u201cQU\u201d is almost always followed by a vowel. Such software solution would typically provide the user with a list of the less preferred letter combinations which the user could select if desired. This means was further improved by providing three conductors, a primary, secondary and tertiary conductor, under each key and which were operable with the software. The primary conductor was engaged when the key was depressed. The secondary and tertiary conductors were disposed adjacent to opposing lateral sides of the keys and were alternately closed when the user depressed one side of the key or the other. Thus, where the letter \u201cA\u201d was located on the left side of a single key, and the secondary conductor was located under the left lateral side of the key, when a user depressed the \u201cA/S\u201d key and pressed on the left side of the key, the primary conductor was engaged indicating the key had been depressed and, if the secondary conductor was depressed, the software would weigh, that is favor, the letter \u201cA\u201d over the letter \u201cS\u201d.\nThis solution, however, has disadvantages as well. For example, some keys may only be associated with a single letter thereby making the secondary and tertiary conductors redundant. Also, some keys, such as a \u201cZ/X\u201d key have letters that are so relatively uncommon in use that the software could reliably choose the proper letter the user intended to use. Again, the secondary and tertiary conductors were essentially wasted. 
Additionally, where each key was structured to overlay three conductors"} +{"output_text": ".\u201d However, the \u201cFolder B\u201d is always displayed. It would be extremely undesirable for a user searching for a desired tune to be always displayed such a meaningless folder, especially on an apparatus with a small display area. This case is not shown in FIG. 6(b), but as seen from FIG. 5, the \u201cFolder F\u201d is always displayed by scrolling FIG. 6(b).\nFurther, in the example shown in FIG. 6(b), no tune exists directly under the", "input_text": " a storage medium aligned with each other regardless of their hierarchical levels, according to a playlist management table. JP-A-2007-250036 discloses a method for displaying structure information by analyzing hierarchical structure of folders stored in a storage medium, creating the structure information including information on the hierarchical structure and information on data stored in the folders, and marking a folder that does not include available data as empty folder.\nAs described above, when listing folders included in a database on a small display area such as the display of a car audio apparatus as shown in FIG. 6 in a manner understandable at a glance to drivers, a method of not displaying the hierarchical structure of the folders, as shown in FIG. 6(a), is often used. Further, the hierarchical structure of the folders may be displayed according to a method used by conventional personal computers, as shown in FIG. 6(b).\nHowever, according to this displaying method, since the entire hierarchical structure of the folders is simply displayed as it is, even a folder directly under which no file (tune) exists is displayed. In the example shown in FIG. 6(b), although no tune exists directly under the \u201cFolder A\u201d and any folder under the \u201cFolder A,\u201d the \u201cFolder A\u201d is always displayed. 
It would be extremely undesirable for a user searching for a desired tune to be always displayed such a meaningless folder, especially on an apparatus with a small display area. This case is not shown in FIG. 6(b), but as seen from FIG. 5, the \u201cFolder F\u201d is always displayed by scrolling FIG. 6(b).\nFurther, in the example shown in FIG. 6(b), no tune exists directly under the \u201cFolder B\u201d and the \u201cFolder F.\u201d Focusing the \u201cFolder B,\u201d tunes only exist directly under the \u201cFolder D\u201d under the \u201cFolder C\u201d under the \u201cFolder B"} +{"output_text": " the free magnetic layer is changed by an external magnetic field.\nAccording to the present invention, there is provided a magnetic memory device comprising: a memory cell having a magnetoresistive element; a plurality of word lines and a plurality of bit lines crossing each other; a plurality of word lines and a plurality of bit lines crossing each other; a plurality of word lines and a plurality of bit lines crossing each other; a plurality of word lines and a plurality of bit lines crossing each other; a plurality", "input_text": " pair of fixed magnetic layers, a pair of conductive layers imparting a sensing current to the free magnetic layer, the pair of nonmagnetic conductive layers, and the pair of fixed magnetic layers, and a pair of bias layers for aligning a magnetization direction of the free magnetic layer, wherein the free magnetic layer is a laminate composed of at least 2L ferromagnetic layers with a nonmagnetic interlayer provided therebetween, the L being an integer of 1 or more, in which magnetization directions of the ferromagnetic layers adjacent to each other are antiparallel to each other so that the entire free magnetic layer is in a ferrimagnetic state; one of the pair of fixed magnetic layers is a laminate composed of at least 2M ferromagnetic layers with a nonmagnetic layer provided therebetween, the M being an integer of 1 or more, in which 
magnetization directions of the ferromagnetic layers adjacent to each other are antiparallel to each other so that the entire fixed magnetic layer is in a ferrimagnetic state, and a magnetization direction of the entire fixed magnetic layer is fixed in a direction crossing the magnetization direction of the entire free magnetic layer by an exchange coupling magnetic field formed by the fixed magnetic layer and one of the antiferromagnetic layer adjacent thereto; the other fixed magnetic layer is one of a single ferromagnetic layer and a laminate composed of at least 2N+1 ferromagnetic layers with a nonmagnetic layer provided therebetween, the N being an integer of 1 or more, magnetization directions of the ferromagnetic layers adjacent to each other being antiparallel to each other so that the entire other fixed magnetic layer is in a ferrimagnetic state, and a magnetization direction of the entire other fixed magnetic layer is fixed so as to be antiparallel to the magnetization direction of the fixed magnetic layer by an exchange coupling magnetic field formed by the other fixed magnetic layer and the other antiferromagnetic layer adjacent thereto; and a magnetization direction of"} +{"output_text": " the provision of a new and improved apparatus or system for applying wrapping film to palletized loads or products. The apparatus or system of the present invention comprises a film roll dispensing mechanism, a film roll support mechanism, and a film roll support and dispensing mechanism. The film roll support and dispensing mechanism is disposed at a location remote from the film roll dispensing mechanism and is operable to support the film roll at a location remote from the film roll dispensing mechanism. 
The film roll support and dispensing mechanism is also", "input_text": " PRIOR ART apparatus, systems, or methods of operating the same, are overcome.\nAn additional object of the present invention is to provide a new and improved apparatus or system for applying wrapping film to palletized loads or products wherein the wrapping film can be applied to or wrapped around the palletized loads or products by means of operator personnel who can simply walk around the pallet upon which the loads or products are disposed and simultaneously push or guide the roll of wrapping film around the palletized loads or products whereby the palletized loads or products are accordingly packaged or wrapped within such wrapping film.\nA further object of the present invention is to provide a new and improved apparatus or system for applying packaging film to palletized loads or products wherein the packaging film can be applied to or wrapped around the entire vertical extent of the palletized loads or products by means of operator personnel who need not support the weight of the film roll, or the film roll and the film roll dispensing mechanism, and in addition need not bend down in order to wrap or apply the stretch film upon or to the lower extremity portions of the palletized loads or products.\nA still yet further object of the present invention is to provide a new and improved apparatus or system for applying packaging film to palletized loads or products wherein the film wrapping apparatus or system is truly portable and transportable so as to readily enable the manual wrapping of palletized loads or products with wrapping film at a particular location within a production facility, at different locations within a particular production facility, or at different production facilities.\nA last object of the present invention is to provide a new and improved apparatus or system for applying packaging film to palletized loads or products wherein the film wrapping apparatus or system 
is relatively simple in structure and economical to manufacture.\nThe foregoing and other objectives are achieved in accordance with the teachings and principles of the present invention through"} +{"output_text": " the lens system, and the sample, the electron beam is kept accelerated until just before it impinges onto the sample, thus to be highly energized. Therefore, the electron beam is likely to be affected by the electric field and the magnetic field of the E.times.B type filter, thus to be deflected toward the secondary electron detector.\nIn addition, the E.times.B type filter is arranged between the sample and the secondary electron detector, and the secondary electrons emitted from", "input_text": " a need for a sensitive inspection apparatus to be used in the semiconductor device manufacturing process for defect inspection in the pattern or the likes in semiconductor wafers. In this regard, there have been electron microscopes used as the inspection apparatus for such defect inspections, as disclosed in Japanese Patent Laid-open Publications Nos. Hei 2-142045 and Hei 5-258703.\nFor example, in the electron microscope as disclosed in Japanese Patent Laid-open Publication No. Hei 2-142045, an electron beam emitted from an electron gun is converged by an objective lens to irradiate a sample to be inspected, and secondary electrons emitted from the sample are detected by a secondary electron detector. 
In addition, in this electron microscope, a negative voltage is applied to the sample, and further an E.times.B type filter is arranged between the sample and the secondary electron detector, said filter having an electric field and a magnetic field crossed at right angles.\nWith such a configuration, this electron microscope allows a high resolution to be obtained by decelerating the electrons irradiated onto the sample by way of the negative voltage applied to the sample.\nFurther, the application of the negative voltage to the sample helps accelerate the secondary electrons emitted from the sample, and the accelerated secondary electrons are further deflected by the E.times.B type filter toward the secondary electron detector, thus to be efficiently detected by the secondary electron detector.\nIn those conventional apparatuses using the electron microscope as described above, the electron beam from the electron gun is kept accelerated to be highly energized until just before it impinges onto the sample, by a lens system such as an objective lens with a high voltage applied. Then, the negative voltage applied to the sample decelerates electrons impinging upon the sample, thus allowing a high resolution to be achieved.\nHowever, since the high voltage is applied to the objective lens,"} +{"output_text": " of the reactor. The condensed stream is then sent to a gasifier, where it is reacted with air and steam to produce a gas stream containing entrained coke particles. The gas stream is then sent to a heater, where it is heated to provide the heat required to maintain the fluidized bed. The heated gas is then sent to the reactor, where it is used to maintain the fluidized bed. The fluidized bed is maintained by the continuous addition of fresh feed and the removal of co", "input_text": " directed against the upper surface of the coke at a distance from the central discharge bore, thereby cutting the coke into pieces. 
The pieces drop out of the coke drum through the pilot hole. The cutting jet traverses the drum until the coke bed is completely removed.\nThe coke leaving ranges in size from large lumps to fine particles. To a considerable extent, the fines are separated from the larger pieces as the coke discharges into slotted bins or hopper cars, with the water draining off through the slots. This dispersion of fines in water is processed to recover the fines as solid fuel, and the water returns to the system for use in quenching and cutting.\nIn a flexicoking process, a material stream circulates continuously between a reactor and a heater. More specifically, a feed stream is fed into a fluidized bed, along with a stream of hot recirculating material. From the reactor, a stream containing coke is circulated to a heater vessel, where it is heated. The hot coke stream is sent from the heater to a gasifier, where it reacts with air and steam. The gasifier product gas, referred to as coke gas, containing entrained coke particles, is returned to the heater and cooled by cold coke from the reactor to provide a portion of the reactor heat requirement. A return stream of coke sent from the gasifier to the heater provides the remainder of the heat requirement. Hot coke gas leaving the heater is used to generate high-pressure steam before being processed for cleanup. Coke is continuously removed from the reactor.\nIn a fluid coking process, a fluidized bed reactor is used in conjunction with a burner to provide continuous coke production. The feed stream is introduced into a scrubber, where it exchanges heat with the reactor overhead effluent and condenses the heaviest fraction of the hydrocarbons leaving the top"} +{"output_text": " in the location server can be used to determine the location of the wireless device, for instance by trilateration, multilateration or triangulation.\nThe location server may be a standalone device or may be integrated with a location network. 
The location server may be a standalone device or may be integrated with a location network. The location server may be a standalone device or may be integrated with a location network. The location server may be a standalone device or may be integrated with a location network.", "input_text": " the location of a wireless device such as a mobile user terminal can be determined with respect to a location network comprising a plurality of wireless reference nodes, in some cases also referred to as anchor nodes. These anchors are wireless nodes whose locations are known a priori, typically being recorded in a location database which can be queried to look up the location of a node. The anchor nodes thus act as reference nodes for localization. Measurements are taken of the signals transmitted between the mobile device and a plurality of anchor nodes, for instance the RSSI (receiver signal strength indicator), ToA (time of arrival) and/or AoA (angle of arrival) of the respective signal. Given such a measurement from three or more nodes, the location of the mobile terminal may then be determined relative to the location network using techniques such as trilateration, multilateration or triangulation. Given the relative location of the mobile terminal and the known locations of the anchor nodes, this in turn allows the location of the mobile device to be determined in more absolute terms, e.g. relative to the globe or a map or floorplan.\nAnother localization technique is to determine the location of mobile device based on a \u201cfingerprint\u201d of a known environment. The fingerprint comprises a set of data points each corresponding to a respective one of a plurality of locations throughout the environment in question. Each data point is generated during a training phase by placing a wireless device at the respective location, taking a measurement of the signals received from or by any reference nodes within range at the respective location (e.g. 
a measure of signal strength such as RSSI), and storing these measurements in a location server along with the coordinates of the respective location. The data point is stored along with other such data points in order to build up a fingerprint of the signal measurements as experienced at various locations within the environment. Once deployed, the signal measurements stored"} +{"output_text": ", the blood meal, and second, the NPF signal.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a high-density interconnection structure.\nIn recent years, the degree of integration of semiconductor devices has been increased, and the size of a circuit element has been reduced. In order to realize a high-density interconnection structure, it is necessary to reduce the size of a circuit element. However,", "input_text": "d Histochemistry 82:9-18). Partial purification of NPY peptides in insects suggests that both NPY and NPF are synthesized in insects (Duve et al., 1981 \u201cIsolation and Partial Characterization of Pancreatic Polypeptide-like Material in the Brain of the Blowfly alliphora vomitoria\u201d Biochem. J. 197, 767-770).\nResearchers have recently isolated two neuropeptides with NPF-like immunoreactivity from brain extracts of the Colorado potato beetle. The researchers purified the peptides using C18 reversed phase high pressure liquid chromatography (HPLC), and determined their structure using mass spectrometry. The deduced structures of these peptides are: Ala-Arg-Gly-Pro-Gln-Leu-Arg-Leu-Arg-Phe-amide (SEQ ID NO. 1) and Ala-Pro-Ser-Arg-Leu-Arg-Phe-amide (SEQ ID NO. 2) designated NPF I and NPF II, respectively (Spittaels etal., 1996).\nThe present inventors have surprisingly discovered that NPF adversely affects TTLE biosynthesis in the midgut of female Aedes aegypti fed a blood meal and injected with NPF polypeptide. 
Because the structure of NPF is different from TMOF it appears that NPF does not bind to a TMOF-specific binding site on the gut receptor but to a different site on the same or different receptor. Furthermore, cytoimmunochemical analysis, by the inventors, of the mosquito gut after the blood meal, using antiserum against NPF, has surprisingly revealed that exocrine cells with NPF-like molecules that are synthesized by mosquito epithelial cells 24 hours after a blood meal. NPF therefore appears to be a secondary signal in a cascade of signals: first"} +{"output_text": " are required to be fitted into the groove 21 and the groove 21' of the driven shaft 16 and the driven shaft 16', respectively.\nIn addition, the retaining ring 24' is required to be fitted into the groove 21' of the driven shaft 16' after the groove 21' of the driven shaft 16' has been fitted into the groove 21 of the rotary shaft 19.\nIn other words, the groove 21 of the rotary shaft 19 and the groove 21' of the driven shaft 16'", "input_text": " is integrally formed with stepped portions 20 and a groove 21, respectively, as a fixing portion or portions and a straight ridge portion 22 as a fixing portion for positioning the rotor 17 in a circumferential direction.\nIn addition, the inner surface of the rotary shaft 19 is integrally formed with a fitting face 19a to be fitted onto the so-called D-cut face 23 of the driven shaft 16 so that the rotary shaft 19 will never rotate relative to the driven shaft 16.\nNumeral 24 is a retaining ring to be fitted into the above-mentioned groove 21 for positioning the hub 6 in an axial direction.\nThe modified conventional electromagnetic clutch shown in FIG. 1 and explained above, however, still has several drawbacks as mentioned below.\n(1) The rotor 17 can be secured, by virtue of the straight ridge portion 22, with respect to the rotary shaft 19 in the direction of rotation, so that these two members always rotate integrally. 
However, since they are fixed through the aforesaid straight ridge in axial direction, when the rotor 17 shifts toward the other coupling members 6 and 9, it is liable to be accompanied by an undesired rotation of the rotary shaft 19 following the rotation of the rotor 17.\n(2) If the length of the magnetic pole 18 of the rotor 17 is reduced, the rotor 17 would become unstable, since the rotor 17 is liable to incline.\nIn addition, due to the fact that it necessitates magnetic pole 18, the cost for making the rotor 17 become inevitably expensive.\n(3) Two retaining rings 24 and 24' shown in FIG. 1 bring about lengthy man hours in setting work and high production cost.\nIn other words, both the retaining ring 24 for preventing the hub 6 from escaping from the rotary shaft and the other retaining ring 24' for preventing the rotary shaft 19 from its slipping out from the driven shaft 16,"} +{"output_text": " in a row in the direction of the rotor axis.\nThe circuit board 27 is connected to the stator coil 16 by a lead terminal protruding from the other side of the diodes 23, 25, 23N, and 25N. The circuit board 27 is connected to the stator coil 16 by a lead terminal protruding from the other side of the diodes 23, 25, 23N, and 25N.\nThe circuit board 27 is connected to the stator coil 16 by a lead terminal protruding", "input_text": " outside through a bracket exhaust port similar to the case of the rear bracket 2 side.\nAs shown in FIG. 14, the rectifier device 12 comprises the positive-pole-side heat sink 24 for supporting a plurality of positive-pole-side diodes 23 and a neutral-point diode 23N, a negative-pole-side heat sink 26 for supporting a plurality of negative-pole-side diodes 25 and a neutral-point diode 25N, and a circuit board 27 for electrically connecting anodes of the diodes 23 and 23N, cathodes of the diodes 25 and 25N, and the stator coil 16. 
A three-phase voltage generated in the stator coil 16 is full-wave-rectified by the rectifier device 12 having the diodes 23 and 25.\nAs shown in FIGS. 16 to 18, the rectifier device 12 including a heat sink and the like is configured so that the entirety thereof forms an approximate horseshoe shape. This configuration is used to store components by efficiently using the space in the generator because the shape of the generator is roughly cylindrical. As shown in FIGS. 16 and 17, the heat sinks 24 and 26 are made of aluminum or the like so as to radiate the heat produced by the diodes 23, 25, 23N, and 25N.\nThe heat sink 24 has a plurality of fins F protruding toward the rotor axis from the back of a joint surface between the positive-pole-side diodes 23 and 23N. Moreover, the heat sink 26 for joining the negative-pole-side diodes 25 and 25N is directly connected at its back side to the rear bracket 2 as a ground.\nThe diodes 23, 25, 23N, and 25N are molded, roughly rectangular and connected to the circuit board 27 by lead terminals protruding from one side of the diodes. The diodes 23, 25, 23N, and 25N are arranged"} +{"output_text": ". The IC is encapsulated in a plastic body and the contacts are exposed on the surface of the plastic body. The plastic body is then molded into a smart card.\nThe smart card is inserted into a smart card reader. The smart card reader is a device that reads the information stored in the smart card and provides the information to a host computer. The smart card reader is a device that reads the information stored in the smart card and provides the information to a host computer. The smart card reader is", "input_text": " connected between the underside of the light duty trailer and the spring member, and a limiting assembly, preferably a double-hinged spring shackle assembly, for connecting a rear end of the spring member to the underside of the light duty trailer. 
The front end of the spring member is pivotally connected to the tower. The rear end of the spring member is pivotally connected to the limiting assembly. The axle is connected to the spring member. A shock absorber is provided between the underside of the light duty trailer and the axle. In operation, if no bumps or ruts are encountered, the spring member remains generally horizontal relative to the underside of the light duty trailer. If one or both of the tires encounter a bump in the road, the rear end of the spring member moves upwardly relative to the underside of the light duty trailer and the air spring and the shock absorber compress. If one or both of the tires encounter a rut in the road, the rear end of the spring member moves downwardly relative to the underside of the light duty trailer and the air spring and the shock absorber expand. Smart cards are plastic cards having an embedded Integrated Circuit (IC). That IC may be a logic circuit with its associated memories or a microcontroller with its associated memories and software, or a microcontroller with its associated memories and software coupled to a custom circuit block or interface.\nTo use the computing power of the IC, a smart card makes use of a full set of packaging technologies. For example, the die size varies from 1 mm2 to 30 mm2, but is limited because of the mechanical limitations imposed by the plastic construction of the smart card. The IC is attached to a lead frame and wire-bonding techniques are used to connect the IC pads to the lead frame contacts. 
Potting or other strengthening methods can be used to protect the IC against chemical and mechanical stresses"} +{"output_text": " of a particular service when the monetary value represented by the monetary balance stored on the removable memory storage device exceeds a predetermined threshold.\nIn this manner, the proprietor of the health monitoring device may be able to collect revenue for the performance of particular services based on the monetary value represented by the monetary balance stored on the removable memory storage device. For example, the proprietor of the health monitoring device may be able to collect revenue for the performance of a particular service when the monetary value represented by the monetary", "input_text": " to individual patients at little or no cost, with the sale of proprietary test strips providing a major source of revenue for the proprietor of the health monitoring device. As noted previously, this may be a desirable business model for deploying the devices because it minimizes the initial cost that an individual patient must pay to begin using the device. Having to sell each device at its full cost, on the other hand, would undermine the economic feasibility of using the device in many contexts.\nNevertheless, it may also be desirable to provide a health monitoring device that does not rely on the sale of proprietary test strips as a major source of revenue. For example, the health monitoring device may be adapted to read non-proprietary test strips, or may incorporate a reusable and/or non-invasive testing device, such as an electrode, blood pressure monitoring device, sonic testing device, thermometer, saliva testing device, optical testing device, and the like. 
Of course, a non-invasive multi-use testing device may be used many times without affording the proprietor of the health monitoring device an opportunity collect revenue associated with each use of the device.\nTo provide an opportunity for the proprietor of the health monitoring device to collect revenue based on use of the device, the removable memory storage device may be utilized as a type of \u201cdebit card\u201d or payment source for use with the health monitoring device. That is, the removable memory storage device may be purchased with a monetary value, or it may have a monetary value that is replenishable over the Internet using a bank credit or debit card or other conventional payment source. The health monitoring device may then deduct the cost of performing particular services from the monetary value represented by the monetary balance stored on the removable memory storage device. In other words, the health monitoring device may be configured to activate for the performance"} +{"output_text": "-mounted displays; and the like.\nThe present invention is directed to a method for the production of a polymer particle containing liquid crystal material, and to a polymer particle containing liquid crystal material produced by the method.\nThe present invention is also directed to a polymer particle containing liquid crystal material produced by the method of the present invention.\n2. Description of the Related Art\nThe present invention relates to a method for the production of a polymer particle containing liquid crystal material, and to a polymer particle", "input_text": " The liquid material may also be a solution of a material which is normally a solid at room temperature. One or more materials may be used in combination with, or in place of, a liquid crystalline material. As used herein, the term \"organic liquid\" includes reagents, adjuvants, and other chemically or biologically active species. 
Examples include inks, toners, dyes, flavors and fragrances. Other examples include biocides such as pesticides, herbicides, mildewcide, insecticides and fungicides, marine anti-fouling agents, pharmaceutically acceptable agents, and the like. The organic liquids used in this manner according to the present invention may be pure liquids, mixtures or solutions of solid or liquid species in organic solvents. The organic liquid may be removed by evaporation, for example during film formation, leaving a void, or air or another gaseous material, within the particle.\nAlternatively, material contained within the droplets may be inorganic or partially inorganic in nature, or may be comprised of precursors of inorganic species. For example, appropriately functionalized organic species could be chemically, or otherwise, converted to inorganic salts or complexes while in the droplet. Such appropriately functionalized organic species could themselves be part of a mixture or solution with one or more additional liquid or solid species. Complexes of organic ligands with metals may also be incorporated into the droplets. As discussed herein, the method of the present invention, is particularly useful in forming uniformly sized polymer particles containing liquid crystal material.\nApplications for liquid crystals include: computer display screens; wristwatches; architectural windows; privacy windows; automotive windows; automobile sunroofs; switching devices such as for optics systems, projection display devices; reflective display devices; hand-held paging devices; cellular phones; laptop computers; television screens including car-mounted television screens; automotive displays including radio, dashboard, and on-board navigation systems; helmet"} +{"output_text": "\nThe overlap writing method is a method of apodizing a fiber grating by overlapping a phase mask and a fiber grating. The overlap writing method is advantageous in that it can be applied to a short fiber grating. 
However, the overlap writing method has a limitation in that it is difficult to apply to a long fiber grating.\nThe use of a PZT is a method of apodizing a fiber grating by applying a voltage to a PZT. The use of a PZT", "input_text": "As the data transmission capacity of a WDM (Wavelength Division Multiplexing) system increases, channel spacing gets narrower. Therefore, there is an increasing need for optical filters that have a narrow bandwidth and excellent adjacent channel isolation characteristics.\nFiber gratings satisfy the requirements of such optical filters, i.e., low loss, low polarization dependence, and high channel selectivity. Further, the cost effectiveness of the fiber gratings makes them popular as optical filters.\nWhen a general fiber grating is fabricated in a conventional method using an excimer laser and a uniform phase mask, the refractive index of the fiber is constant over the length of the grating. In such a fiber, however, a sidelobe occurs and as a result, no apodization is achieved at the fiber grating. This sidelobe can be reduced by apodizing the fiber grating such that the magnitude of a refractive index variation is decreased toward the ends of the fiber grating.\nAn apodized fiber grating refers to a fiber grating of which the refractive index increases or decreases toward the center or both ends. The apodized fiber grating shows minimized sidelobes in both a short wavelength band and a long wavelength band. Although this apodization is effective in reducing a sidelobe in a longer wavelength band, it has limitations in reducing a sidelobe in a shorter wavelength band due to self-induced chirping of a fiber grating.\nThe self-induced chirping is attributed to an inconstant average refractive index of the fiber grating. 
Accordingly, the average refractive index should be made constant with respect to grating length in order to reduce a sidelobe which arises from the self-induced chirping.\nOther conventional fiber grating apodizing methods besides the conventional method discussed above include overlap writing, use of a PZT (Piezo Transducer), optical scanning, and use of a spatial filter."} +{"output_text": " molecule encoded by a gene.\nA xe2x80x9cgenexe2x80x9d is a nucleic acid molecule which encodes a gene product.\nA xe2x80x9cgenomic sequencexe2x80x9d is a nucleic acid sequence which is derived from a genomic DNA sequence.\nA xe2x80x9cgenomic DNA sequencexe2x80x9d is a DNA sequence derived from a genomic DNA sequence.\nA", "input_text": ", for example, a recombinant DNA which is incorporated into a vector, e.g., into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other DNA sequences. Substantially pure DNA also includes a recombinant DNA which is part of a hybrid gene encoding additional P. aeruginosa DNA sequence.\nA xe2x80x9ccontigxe2x80x9d as used herein is a nucleic acid representing a continuous stretch of genomic sequence of an organism.\nAn xe2x80x9copen reading framexe2x80x9d, also referred to herein as ORF, is a region of nucleic acid which encodes a polypeptide. This region may represent a portion of a coding sequence or a total sequence and can be determined from a stop to stop codon or from a start to stop codon.\nAs used herein, a xe2x80x9ccoding sequencexe2x80x9d is a nucleic acid which is transcribed into messenger RNA and/or translated into a polypeptide when placed under the control of appropriate regulatory sequences. 
The boundaries of the coding sequence are determined by a translation start codon at the five prime terminus and a translation stop code at the three prime terminus. A coding sequence can include but is not limited to messenger RNA, synthetic DNA, and recombinant nucleic acid sequences.\nA xe2x80x9ccomplementxe2x80x9d of a nucleic acid as used herein refers to an anti-parallel or antisense sequence that participates in Watson-Crick base-pairing with the original sequence.\nA xe2x80x9cgene productxe2x80x9d is a protein or structural RNA"} +{"output_text": " U.S. Pat. No. 5,437,811, a cholesteric liquid crystal display is disclosed in which the cholesteric liquid crystal is stabilized by polymerizing a polymerizable monomer in the liquid crystal. The polymerizable monomer is polymerized in the liquid crystal by applying a voltage to the liquid crystal. The polymerized monomer is then cross-linked to form a polymer network in the liquid crystal. The polymer network is formed by polymerizing the monomer in the liquid crystal.", "input_text": " aligned vertically, the incident light is decomposed into its right and left circular components with one component reflected and the other transmitted. The unique ability of a cholesteric liquid crystal to reflect light comes from their helical superstructure. The central reflected wavelength (\u03bbo) in a direction normal to the surface can be described as \u03bb0= n\u00b7p= n\u00b7(C\u00b7HTP)\u22121, where p is the helical pitch, in which the director rotates 360 degree, n is the average refractive index of the liquid crystal, C is the concentration of chiral dopant and HTP is the helical twisting power of the chiral material. The bandwidth (\u0394\u03bb) of the reflected light equals \u0394n\u03bb/ n, where \u0394n is the birefringence of liquid crystal and n is average of refractive index. 
A continuous tunable and electrically programmable optical filter based on cholesteric liquid crystal can be fabricated for filtering different spatial wavelength. The bandpass filters can achieve 100% transmission or reflection when a combination of two cholesteric filters with the same reflection wavelength and opposite handedness are stacked.\nWhen the helical pitch of a cholesteric liquid crystal is adjusted to Bragg reflect in the visible spectrum, it reflects an iridescent color. Depending on the magnitude of an applied voltage, the cholesteric liquid crystal in an electro-optical cell can be switched to different optical states such as the planar to focal conic and planar to homeotropic in which the incident light is weakly scatted or totally transmitted, respectively. The cholesteric cell displays an image which can remain on a display permanently without an applied voltage. This memory phenomenon can be achieved either by using surface treatment or polymer stabilization, as detailed, e.g. in U.S. Pat. Nos. 5,437,811, 5,691,795 and 5,695,682\nFor example, in"} +{"output_text": " the threshold value is set to a value that is higher than the effective value of voltage of the AC signal Vs.\nIf leakage occurs in the power supply system 10, the effective value of voltage measured by the voltage measuring section 40 is lower than the effective value of voltage of the AC signal Vs outputted from the oscillator 21, and the threshold value is set to a value that is lower than the effective value of voltage of the AC signal Vs.\nThe leakage detection section 20 detects", "input_text": " elements) 76 and six IGBT circuits 70-75 having corresponding six diodes 77.\nWhen the AC motor 15 is a three-phase motor, three sets of circuits, the IGBT circuits 70, 73, the IGBT circuits 71, 74 and the IGBT circuits 72, 75, are connected in parallel. 
Additionally, an intermediate point M1 between the IGBT circuits 70, 73, an intermediate point M2 between the IGBT circuits 71, 74 and an intermediate point M3 between the IGBT circuits 72, 75 are respectively connected to three coils in the AC motor 15.\nThe leakage detection section 20 comprises a capacitor C that is connected to a voltage applying point P on the positive line 13 connected to the positive side of the battery, a resistance R that is connected to the capacitor C, an oscillator 21 that generates an AC signal Vs with a prescribed frequency such as a sine wave or a square wave and provides the AC signal Vs to the resistance R, and a voltage measurement section 40 that measures a voltage level (effective value of the AC voltage) at a voltage measurement point Q located between the resistance R and the capacitor C. While the voltage measurement section 40 measures the voltage level, a threshold value is set to determine whether or not the leakage exists.\nA process of detecting the leakage in the leakage detection section 20 shown in FIG. 6 is performed as follows.\nIt is assumed that the insulation of the negative line 14 becomes deteriorated and leakage occurs therein.\nThe AC signal Vs outputted from the oscillator 21 passes through the resistance R and the capacitor C, and is applied to the applying point P on the positive line 13.\nIf no leakage exists in the power supply system 10, the effective value of voltage measured by the voltage measuring section 40 is substantially the same as the effective value of voltage of the AC signal Vs outputted from the oscillator 21, and"} +{"output_text": " process-layout interactions are minimized.\nIn addition, the design of CAM layouts is further complicated by the fact that CAMs are typically used in a variety of applications. For example, CAMs are used in cache memory applications, where the CAM is used to store the address of the most recently used data. 
In this case, the CAM is used to store the address of the data that is most recently used. In another example, CAMs are used in network routers, where the CAM is", "input_text": " 2TCAMBL1/BL1BL2/BL2Cell Value010100110Don't care1001Not used10101As discussed previously, a binary CAM has only two values and will either match or not match against the hitlines. In the ternary CAM, there are four possible combinations of values shown by the bitline, although one of the possible values, with both BL1 and /BL2 high, is not used. When both BL1 and /BL2 are low, this signals a \u201cdon't care\u201d value, which will show as a match against any value. This allows some portions of a pattern to be ignored while other portions are compared.\nProblems Encountered in Designing CAM Layouts\nDue to the increasing use of large CAM memories in System-on-Chip (SoC), it is necessary that a high density of such memories be achieved while still delivering the highest possible performance. The proper design of unit binary or ternary CAM bitcells is, therefore, of great importance in order to optimize chip density and performance.\nIn addition, with the advent of more advanced process technologies, such as the 90 nanometer (nm) process, the yield and performance of CAMs are much more sensitive to process-layout interactions as compared to previous technologies. For example, it is known that mechanical stresses that are induced in shallow trench isolation (STI) directly impact the performance of transistors due to the transfer of these stresses to the metal-oxide semiconductor (MOS) channel region. The negative effect of stress on transistor performance is directly dependent on the distance between the STI edge and the transistor. SRAM cells, owing to tighter design rules, are generally more affected by such process-layout interactions. 
In order to be used in advanced process technologies such as the 90-nm process, the layouts of binary and ternary CAM cells need to be such that these"} +{"output_text": ".\nThe present invention is directed to a method of making a reflective optical element comprising the steps of:\n(a) providing a lens possessing at least one optically shaped surface a portion of which is suitable as an adhesive substrate on which is disposed a lens film comprising an adhesive layer and backing selected to maintain a uniform thickness when deformed, a metal foil and metallized polymer layer possessing a uniform light reflecting surface and of a thickness selected to be flexible when deformed, wherein said lens film is applied and secured", "input_text": " those with normal vision are selected for style and purchased for use without delay for laboratory work. Thus the delay and inconvenience of sending eyeglasses to a mirroring laboratory is a hindrance to making glasses mounted display devices readily available and widely acceptable.\nFor spectacle mounted displays with large numerals such as a digital clock, the precision and cost of masking and sputtering a protected aluminized coating onto a glasses lens surface has proven to be a costly and overly precise optical response. 
While an aluminized coating may provide an excellent optical image of the object, in this case four numerals, such precision of image is unnecessary for the user to be able to discern the time correctly.\nIn order to overcome the shortcomings of the current condition, the present invention is directed to a reflective optical element comprising a lens possessing at least one optically shaped surface a portion of which is suitable as an adhesive substrate on which is disposed a lens film comprising an adhesive layer and backing selected to maintain a uniform thickness when deformed, a metal foil and metallized polymer layer possessing a uniform light reflecting surface and of a thickness selected to be flexible when deformed, wherein said lens film is applied and secured to said lens with pressure and stress to conform to said optically shaped surface and adhere said adhesive backing to said lens substrate. Said light reflecting surface is selected and positioned within the optical train of a spectacle mounted display and viewer. In some embodiments of the present invention, the lens film is partly transparent. In the preferred embodiment of the current invention as a spectacle mounted display, the resulting image quality is suitable for the display of a limited quantity of text.\nIn the prior art, reflective tape was considered to be too imprecise a medium for functioning as an optical element to generate a recognizable image. 
Although metallized film is readily available as flat mirrors for decorative purposes, spherical, cylindrical or aspheric shapes in small sizes lens were considered unlikely to produce acceptable images"} +{"output_text": " the lateral sides of the first and the second receiving chambers respectively, the vertically contacting pin set and the horizontally contacting pin set can contact respectively with the silicon disks with different specifications inserted into the first and the second receiving chambers, thereby, the read/write signal transmission module for silicon disks of the present invention can suit all kinds of silicon disks.\nThe present invention will be apparent in its novelty and features after reading the detailed description of the preferred embodiment thereof in reference to the accompanying drawings.", "input_text": " as PC ATA cards or CF cards as shown in FIG. 5 depicting a silicon disk Axe2x80x2), thereby, the contact pin sets 6 provided on the upper and the lower electric circuit boards 2 can not contact or connect with such a silicon disk 5, they do not suit all kinds of silicon disks.\nTo solve the above stated problems and to render a read/write signal transmission module for silicon disks to suit all kinds of silicon disks, the inventor of the present invention reconsider the above stated primary structure concentrating on designing of the different specifications, and developed the multi-specification read/write signal transmission module for silicon disks of the present invention after nonstop study and tests, with the module, all known silicon disks with different specifications can be inserted and read/written.\nIn particular, the present invention is provided on the center of a xe2x80x9cUxe2x80x9d shaped base with a partitioning plate, an upper and a lower lid, so that the base is formed therein a first and a second receiving chamber. 
The base is provided respectively on the lateral arms at the lateral sides of the first and the second receiving chambers with a plurality of guide recesses, thereby, different kinds of silicon disks with different specifications can be inserted therein. The first receiving chamber is provided therein at least with a vertically contacting pin set, and the second receiving chamber is provided therein at least with a horizontally contacting pin set, in order that when silicon disks with different specifications are individually inserted into the first or the second receiving chamber, they can contact respectively with the vertically contacting pin set or the horizontally contacting pin set. In this way, the module of the present invention can suit all the silicon disks with different specifications in the markets.\nThe primary object of the present invention is that, by the fact that the above stated vertically and the horizontally contacting pin sets are allocated at"} +{"output_text": " of the laser spot on the cantilever. The laser spot position is determined by the laser beam focusing optics, which is usually a lens. The lens is usually a spherical lens, which is not suitable for the detection of nanomechanical cantilever sensors. The spherical lens has a focal length of about 1 mm, which is too long for the detection of nanomechanical cantilever sensors. The spherical lens is also not suitable for the detection of nanomechanical cantilever sensors because the spherical lens", "input_text": " it is due to the interaction with target species or not. This occurs because the specificity and selectivity of nanomechanical sensors comes only from immobilized molecules or receptors. The mechanical deformations of the cantilevers do not directly relate to molecular structure or property, even if the peak shape (ascending side and descending side) contains some specific molecular information. 
Although spectroscopic techniques (infrared and visible spectrometry, mass spectrometry, and nuclear magnetic resonance) could provide detailed information for molecular identification, adding scanning wavelength spectroscopic accessories to portable sensors increases their size, complexity, and expense, while reducing adaptability of the devices. Because one of the most important driving forces of creating a small sensor is to achieve wide and easy deployment as the result of compact size, simple structure and low price, combining the device with one spectroscopic method is thus not preferred.\nSecondly, for most of the biomolecular detection applications, cantilever sensors have to take an extra step (e.g., heat treatment or stringent washing) to remove attached molecular species for the next detection. In other words, the sensor needs to recognize and detect molecules with adjustable sensitivity over a large concentration range call for an adaptive and reconfigurable sensing strategy.\nAnother limitation of nanomechanical cantilever sensors is the structural configuration wherein the sensor must have rectangular cross sections with thickness at sub-micrometer scale, and require that the top surface and the bottom surface be made of different material or be chemically modified. Usually a gold thin film is deposited on one side of the silicon substrate to reflect a laser beam and immobilize receptors. Such asymmetric bimorph thin film structures are sensitive to temperature change. The temperature must be controlled precisely using a thermoelectrically stabilized cell and stabilized for hours before detection, otherwise the specific binding signal cannot be differentiated from thermal noise.\nAnother limitation of the nanomechanical cantilever sensor is that during laser deflection detection, accurate bending measurements depend on the position"} +{"output_text": " a second animal. The second animal is then exposed to the venom. 
The serum of the second animal is then collected and used to immunize a third animal. The third animal is then exposed to the venom. The serum of the third animal is then collected and used to immunize a fourth animal. The fourth animal is then exposed to the venom. The serum of the fourth animal is then collected and used to immunize a fifth animal. The fifth animal is then exposed to the venom. The serum", "input_text": "oms are too difficult or too expensive to obtain to immunize a population where a relatively small percentage of that population will be exposed to the animal venom. Second, even if they can be obtained, animal venoms, unless detoxified, may cause more morbidity when administered to a large population than would be caused by the venomous animals themselves. Third, even if the venom is affordable, obtained in sufficient quantity, and detoxified, it is extremely difficult to achieve the titer of circulating antibody necessary to neutralize the infusion of what can be a large amount of venom (up to one gram of animal venom as compared with nanogram or picogram amounts of tetanus toxin). Finally, even with successful immunization, immunological memory is too slow to respond to the immediate crisis of envenomation.\nAlthough active immunization with venoms has the above-named problems, some investigators have chosen to pursue research in this area rather than in the area of passive immunization, arguing that passive immunization is too long and expensive. These investigators have made some progress in the method of immunization by using liposomes. R. R. C. New et al., New Eng. J. Med. 311 56 (1984). T. V. Freitas et al., Toxicon 27:341 (1989).\nB. Passive Immunization\nBecause the problems with active immunization have not been overcome, the only treatment available for venoms is passive immunization. Passive immunization, like active immunization, relies on antibodies binding to antigens. 
For our purposes here, antitoxin refers to antibody raised against a single toxin. Antivenom refers to.antibody raised against whole venom.\nIn the case of passive immunization, the antibody used to bind the venom (antigen) is not made in the animal afflicted with the venom. Generally, an immune response is generated in a first animal. The serum of the first animal is then administered to"} +{"output_text": " small-diameter rods are flexed at right angles to the axes of the support and the cylindrical portion of the vibrator, so that the small-diameter rods are subjected to a large bending moment. This causes the small-diameter rods to be subjected to a large bending stress, and the small-diameter rods are liable to be broken.\nThe present invention has been made in view of the above-mentioned problems of the prior art, and it is an object of the present", "input_text": "ator and a forward end of the support to connect them together in such a manner that the small-diameter rods are arranged in a circle concentric with the support and the cylindrical portion of the vibrator, and a bellows of resilient material mounted between the forward end of the support and a back surface of the conical forward end portion of the vibrator. The vibrator has a rotary shaft journalled in the cylindrical portion thereof coaxially therewith, and an eccentric weight is mounted on the rotary shaft which is connected to a drive motor. A pipe laying apparatus equipped with this type of vibratory excavator offers the same advantages as the first-mentioned pipe laying apparatus of the vibration excavation type of the prior art.\nMoreover, the vibrator excavator noted hereinabove is equipped with shock absorbing small-diameter rods interposed between the support and vibrator. 
This causes a thrust applied to the pipes to be laid by hydraulic cylinders to be transmitted to the vibrator, while, the vibration of the vibrator (vibration which is at right angles to the axes of the support and the cylindrical portion of the vibrator) is absorbed by the small-diameter rods which are flexed at right angles to the axes of the support and the cylindrical portion of the vibrator, so that the vibration of the vibrator is prevented from being transmitted to the pipes to be laid through the support. Stated differently, the vibrator and the pipes to be laid are connected together in flexible coupling through the small-diameter rods and support. This makes it possible to provide improvements in the first-mentioned vibratory excavator of the prior art which suffers the disadvantage that the excavator should be large in size to develop a vibration of high magnitude due to the vibration to be transmitted to the pipes to be laid.\nThe problem raised with regard to this type of vibratory excavator is that the"} +{"output_text": ", but not in another. The authors state that the venom is a complex mixture of proteins and peptides, and that the analgesic activity is due to the action of a single component.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a high-speed operation and a high-density integration.\nIn recent years, a semiconductor device having a high-speed operation and a high-density integration has been developed. In order", "input_text": " possesses it\"\"s own characteristic venom composition. To date, only a few hundred compounds from some 400 venomous snake species have been characterized. These include enzymes, toxins, growth factors, etc. Most of the isolated venom compounds are of unknown function.\nTraditionally, snake venom is considered a source of toxic substances. However, it is also a source of analgesics. 
Doctors who treated patients bitten by a South American snake (Crotalus durissus terrificus) reported that although these patients were in a life-threatening condition, they felt no pain. A neurotoxin product isolated from snake venom was regarded as a new type of analgesic at the First Congress of Neurotoxicology (1977) in Yugoslavia. These and other observations led to attempts to isolate anesthetic compounds from snake venom.\nBevan, P. and Hiestand, P. (1983) J. Biol. Chem. 258:5319-5326 describe a single chain polypeptide isolated from Vipera russelli russelli venom by cation exchange chromatography. The polypeptide competes with the binding of monoamines and opiate ligands to their respective receptors, and injection of the polypeptide intracerebroventricularly in rats causes marked sedation. The authors state that the polypeptide is a large and highly charged molecule which is unlikely to pass the blood-brain barrier. The polypeptide was found to be a moderately potent toxin, similar to the crude venom.\nDutta, A. S. and Chaudhuri, A. K. N. (1991) Indian J. Exp. Biol. 29:937-942 describe experiments carried out with crude venom of Vipera russelli on mice and rats. The venom was injected intraperitoneally and intravenously, and was found to produce alterations in general behavior patterns connected with the CNS. The venom showed significant analgesic activity in one assay"} +{"output_text": " provide a relatively uniform distribution of air flow over the pile. The air flow is usually provided by a fan or the like. The air flow is directed upwardly through the pile and is discharged through the top of the pile. The air flow is usually sufficient to remove the moisture from the coal. However, the air flow is not sufficient to remove the moisture from the coal. The moisture is removed by the action of the air on the coal. The air flow is not sufficient to remove the moisture from", "input_text": "es or other hydrocarbon materials. 
Reference is made, e.g., to U.S. Pat. No. 1,960,917 (Process for treating coal), U.S. Pat. No. 2,197,792 (Coal spraying chute), U.S. Pat. No. 2,204,781 (Art of protecting coal and like), U.S. Pat. No. 2,610,115 (Method for dehydrating lignite) and U.S. Pat. No. 2,811,427 (Lignite fuel). U.S. Pat. No. 3,961,914 (Process for treating coal to make it resistant to spontaneous combustion) disclosed a silicon dioxide film on the coal surface. The entire disclosure of each of these United States patents is hereby incorporated by reference into this specification. Without wishing to be bound by any particular theory, applicants believe that favorable altering of the surface components reduces the reactivity and oxidation.\nOther methods have used application of oxidizing agents or treatment with high temperature under pressure (U.S. Pat. No. 6,146,432 at column 2, lines 35-60). The entire disclosure of each of these United States patents is hereby incorporated by reference into this specification. Yet other processes use controlled drying in a manner that particle surface pores are self-sealed by hydrocarbon material evolved from the particles.\nOther approaches include the prolonged exposure of the coal to air, the use of oxidizing agents sprayed on coal, and treating the coal with high-temperature water under pressure. The coatings perform their work by covering the pores and limiting the access of active components of the air to active sites on the dried coal. U.S. Pat. No. 3,723,079 (Stabilization of coal) explains: \u201cFor example, coal piles are often arranged in a particular manner to"} +{"output_text": " the form of a perforated plate, is arranged in the inlet portion of the housing and is connected to a suction duct. The suction duct is connected to a suction fan which is arranged in the housing and which is connected to a suction duct. 
The suction fan is connected to a suction duct which is connected to a suction fan which is arranged in the housing and which is connected to a suction duct. The suction ducts are connected to a suction fan which is arranged in the housing and which is connected", "input_text": " here with the aid of pressure measurement outlets, the pressure drop across the component then being proportional to the square of the flow. The pressure measurement outlets are connected via hoses to a pressure-sensing means in a meter with a pointer for visual indication of the flow. This measurement method is however burdened with the disadvantage that comparatively poor measurement accuracy is obtained, partly due to the comparatively low flow rate in the ducts and partly due to practical installation difficulties.\nAnother method of flow determination is described in the Swedish patent specification SE-C-455 442 (AB Bahco Ventilation). In this case a filter in a central unit is exchanged for two perforated plates serving as constriction means, pressure sensors then being used to measure the different pressure drops which occur with the filter in place on the one hand, and the constriction plates on the other. On the basis of the flows which have already been measured with the constriction plates located in a similar unit, the flow rate is interpolated or extrapolated when the filter is in place, e.g. graphically with the aid of a diagram.\nThis method also gives comparatively poor measurement accuracy, and it cannot be used for continuous measurement during operation of the installation, at least not without considerable complications and work from personnel.\nYet another known method is described in the published Swedish patent application SE-A-8701663-0 (Flakt AB), the pressure drop measurement being carried out on the suction side of a suction fan in a ventilation installation.\nThe fan is placed in an apparatus housing and on its pressure side is connected to a duct system. 
A constriction means is arranged in the inlet portion of the housing on the suction side of the fan and has two pressure tappings connected to a differential pressure measurement device for determining the pressure drop and the flow rate.\nThe constriction means, e.g. in"} +{"output_text": "\nThe present invention provides a method and apparatus for temporarily disabling a transponder, such as a sticker tag, that is permanently mounted to a surface. The apparatus includes a housing having a cavity therein, a first antenna disposed within the cavity, and a second antenna disposed within the cavity. The first antenna is configured to transmit a first signal to the transponder, and the second antenna is configured to transmit a second signal to the transponder. The apparatus also includes a switch disposed within", "input_text": " easier to distribute and have more capability than the previous tags constructed of conventional printed circuit boards and housed in a plastic case. Another advantage is that the \u2018sticker\u2019 tags are designed to be permanently mounted and thus provide more security from fraud by preventing tags to be moved from one vehicle to another. The disadvantage of not being able to move tags between vehicles is offset by the lower cost so that an individual tag can be issued to each vehicle economically. A disadvantage of the permanently mounted \u2018sticker\u2019 tags, however, is that they can't be temporarily disabled. For example, once removed from the windshield, the adhesive on a windshield sticker tag, such as that produced by Transcore, Inc., can be damaged. Since the antenna design in this tag relies on uniform close proximity to the glass for proper operation, the tag cannot be reused. 
Thus, sticker tags and any other permanently mounted transponder would be read every time it passed within the RF field of an applicable interrogation system, even when the user did not desire to have the RFID tag read, e.g., to disable the tag when it was desired to pay using cash or other means.\nRFID tags can be permanently disabled by mechanical destruction of the conductive patterns on the tag. An example is provided in U.S. Pat. No. 7,277,016 (Moskowitz et al.). While permanent tag disabling has certain viable applications, by definition it is unsuitable for applications where the tag is to be disabled only temporarily and so that it can be reused at some later time.\nAccordingly a need exists for a device and method to temporarily disable sticker or other permanently mounted tags with the ease and simplicity as has characterized the temporary disablement of hard cased tags (e.g., by removal and placement in a remote location or in a shielding pouch so that the tag could not be read)."} +{"output_text": " displacement equipment are each controlled by a separate button.\nThe pipette tips are usually detachably clamped on the spigots by means of a clamping device. The clamping device is usually a clamping screw which is screwed into a threaded bore of the spigot and which is tightened by means of a screwdriver. The pipette tip is clamped on the spigot by means of the clamping screw. The pipette tip is usually detachably clamped on the spigot by means", "input_text": " equipment with a drive device and an ejector. By actuating the drive device, the ejector is dislocated such that it detaches the pipette tip from the spigot without that the user must touch it. The drive device has often a mechanism which must be actuated by means of a button in order to detach the pipette tip from the spigot. Alternatively, the drive device has an electric motor which can be controlled by actuating a button in order to detach the pipette tip from the spigot. 
This applies in particular for manual pipettes, i.e. pipettes which can be held and operated by the user with one or both hands in the utilization. In the embodiment as manually driven pipettes, these pipettes have a mechanism for the displacement equipment which is manually drivable by means of a dosing button, and in the embodiment as electronic pipettes an electric drive motor for the displacement equipment which can be controlled by means of an electric dosing button.\nDetaching a pipette tip from the spigot can necessitate a significant effort when a pipette tip is to be firmly clamped up on a spigot.\nMultichannel pipettes serve for picking up liquid from one or several vessels or to deliver into one or several vessels concomitantly. Multichannel pipettes are often used for the handling of microtiter plates, which have a plurality of vessels in a matrix-like arrangement. For this purpose, multichannel pipettes have several spigots, arranged parallel side by side in a row in the same height, whose through holes are each one connected to a separate displacement equipment or to a common displacement equipment. In adaptation to a frequently used format of microtiter plates with 96 (=8\u00d712) vessels, multichannel pipettes have frequently eight or twelve spigots. 
The several displacement equipments or the common"} +{"output_text": "ine, 4-tert-butylpyridine, diphenylpyridine, benzylpyridine, methoxypyridine, butoxypyridine, dimethoxypyridine, 1-methyl-2-pyridine, 4-pyrrolidinopyridine, 1-methyl-4-phenylpyridine, etc.), pyridazine derivatives, pyrimidine derivatives, pyrazine derivatives, pyrazoline derivatives, pyrazolidine derivatives, piperidine", "input_text": "methylaniline, 3-methylaniline, 4-methylaniline, ethylaniline, propylaniline, trimethylaniline, 2-nitroaniline, 3-nitroaniline, 4-nitroaniline, 2,4-dinitroaniline, 2,6-dinitroaniline, 3,5-dinitroaniline, N,N-dimethyltoluidine, etc.), diphenyl(p-tolyl)amine, methyldiphenylamine, triphenylamine, phenylenediamine, naphthylamine, diaminonaphthalene, pyrrole derivatives (ex. pyrrole, 2H-pyrrole, 1-methylpyrrole, 2,4-dimethylpyrrole, 2,5-dimethylpyrrole, N-methylpyrrole, etc.), oxazole derivatives (ex. oxazole, isoxazole, etc.), thiazole derivatives (ex. thiazole, isothiazole, etc.), imidazole derivatives (ex. imidazole, 4-methylimidazole, 4-methyl-2-phenylimidazole, etc.), pyrazole derivatives, furazane derivatives, pyrroline derivatives (ex. pyrroline, 2-methyl-1-pyrroline, etc.), pyrrolidine derivatives (ex. pyrrolidine, N-methylpyrrolidine, pyrrolidinone, N-methylpyrrolidone, etc.), imidazoline derivatives, imidazolidine derivatives, pyridine derivatives (ex. pyridine, methylpyridine, ethylpyridine, propylpyridine, butylpyridine, 4-(1-butylpentyl)pyridine, dimethylpyridine, trimethylpyridine, triethylpyridine, phenylpyridine, 3-methyl-2-phenylpyrid"} +{"output_text": " a particular Gxcex1 subunit.\nThe Gxcex2/Gxcex3 (Ste4p/Ste18p) particle is a heterotrimer of the products of the STE4, STE18 and STE20 genes. The Gxcex2/Gxcex3 particle is a Gxcex2/Gxcex3-specific G protein-coupled receptor. The Gxcex2/Gxcex3 particle is a heterotrimer of the products", "input_text": "omyces cerevisiae (20). Cells of the MATa mating type express a receptor encoded by the STE2 gene. 
This receptor becomes activated upon binding of the \u03b1-factor mating pheromone, a peptide secreted by cells of the opposite (MAT\u03b1) mating type. The yeast G protein is assembled from the products of the GPA1 (G\u03b1), STE4 (G\u03b2), and STE18 (G\u03b3) genes. The G\u03b2/G\u03b3 (Ste4p/Ste18p) particle released upon activation of the Ste2p receptor conveys the signal to a mitogen-activated protein kinase (MAPK) module. This leads to activation of the cyclin-dependent kinase inhibitor Far1p, causing cell cycle arrest and transcriptional induction of a set of genes involved in the mating process, including FUS1. The pathway is desensitised by Sst2p, a member of the RGS family. Cells of the opposite mating type (MAT\u03b1) express a different receptor (Ste3p) and thereby respond to the pheromone (a-factor) secreted by MATa cells; otherwise the signalling apparatus utilised in the two mating types is the same.\nAt present, at least 16 G\u03b1 subunits, 5 G\u03b2 subunits and 11 G\u03b3 subunits have been identified in mammals, which can assemble a wide diversity of trimeric G proteins. On the basis of sequence homology, the G\u03b1 subunits fall into at least four families, related to G\u03b1i, G\u03b1s, G\u03b1q, or G\u03b112. Typically, a given 7TM receptor activates only a single or small subset of G\u03b1 subunits. Thus even in cells which express multiple G\u03b1 subunits, signalling may be specific to"} +{"output_text": "up of MTXPGs in the cells.\nThe polyglutamation of methotrexate is also thought to be important in the pharmacological effects of the drug. For example, the polyglutamation of methotrexate is thought to be important in the pharmacological effects of the drug in the treatment of rheumatoid arthritis (Angelis-Stoforidis et al., supra, (1999); Allegra et al., supra, (1985)). In addition, the polyglutamation of methotrex", "input_text": " sequential addition of glutamic acid residues enhances intracellular retention of methotrexate (Allegra et al., Proc. 
Natl. Acad. Sci. USA, 82:4881-4885 (1985)). The polyglutamation process is in competition with deconjugation by gamma glutamyl hydrolase (GGH) (Rhee et al., Mol. Pharmacol., 53:1040-1046 (1998); Yao et al., Proc. Natl. Acad. Sci. USA, 93:10134-10138 (1996); Panetta et al., Clin. Cancer Res., 8:2423-2429 (2002)), a lysosomal enzyme having high affinity towards long chain polyglutamates (Masson et al., J. Clin. Invest., 97:73-80 (1996)).\nThe accumulation of MTXPGs is critical to the pharmacological effects of methotrexate. In vivo, the concentration of MTXPGs in lymphoblasts and erythrocytes appears to correlate with the therapeutic response to methotrexate in patients with leukemia (Dervieux et al., Blood, 100:1240-1247 (2002); Dervieux et al., Arthritis Rheum., in press, (2004)) or rheumatoid arthritis (Angelis-Stoforidis et al., Clin. Exp. Rheumatol., 17:313-320 (1999); Allegra et al., Proc. Natl. Acad. Sci. USA, 82:4881-4885 (1985)). Polyglutamation of methotrexate is thought to promote the sustained inhibition of de novo purine synthesis by 5-aminoimidazole carboxamide-ribonucleotide transformylase (ATIC) (Dervieux et al., Blood, 100:1240-1247 (2002); Allegra et al., supra, (1985)), thereby promoting the build-"} +{"output_text": " systems of the prior art have the same basic problem of overloading and underloading the bearings. The present invention solves this problem by providing a thrust balancing system which is responsive to the pressure of the gas at the inlet and outlet of the compressor and which is responsive to the pressure of the gas at the inlet and outlet of the compressor and which is responsive to the pressure of the gas at the inlet and outlet of the compressor and which is responsive to the pressure of the gas at the inlet and outlet of", "input_text": " 25, 1979 to Webb also correlates the biasing of the thrust balance pistons to the discharge pressure of the compressor. 
Webb uses a valve structure in the high pressure lubrication oil line to attenuate the pressure applied to the thrust balance piston so that it will be approximately 20 psi below whatever the compressor discharge pressure is. However, the basic problem of overloading and underloading is not solved.\nU.S. Pat. No. Reissue 32,055 issued Dec. 24, 1985 to Schibbye et al discloses that high pressure lubricating oil should be supplied to the thrust balance piston on the low pressure end of the male rotor; that a mean lubricating oil pressure should be applied to the high pressure ends of both the male and female rotors; and that an axial connection passage be provided from the high pressure end of the female rotor to the female rotor balancing piston at the low pressure end thereof to keep both ends at the mean pressure. Thus, the low pressure end of the male rotor is at a high thrust balancing pressure and the low pressure end of the female rotor is at a lower mean thrust balancing pressure to help increase service life of the bearings but does not fully address the problem of underloading and overloading the bearings.\nU.S. Pat. No. 4,964,790 issued Oct. 23, 1990 to Scott states that in the prior art \"the balancing pressure on the pistons is not responsive to the various operative parameters other than outlet pressure of the rotary compressor.\" Scott discloses a complex system using a microprocessor control for computing a net counterbalancing force in response to inputs or sensed parameters relating to the pressure of gas at the inlet and outlet of the compressor, and regulates a variable valve of an oil pump responsive to the microprocessor signal to control the amount of thrust balancing oil pressure applied to the counterbalancing pistons.\nAll of the thrust balancing"} +{"output_text": " model.\n2. Background of the Invention\nIn the design of digital systems, it is often necessary to simulate the operation of the system. 
This is particularly true in the design of digital systems that include a large number of components. For example, in the design of a digital system that includes a microprocessor, it is necessary to simulate the operation of the microprocessor.\nIn order to simulate the operation of a digital system, it is necessary to create a simulation model of the digital system.", "input_text": " drive trains. Such systems are complicated and more costly than standard drive shaft systems or chain drive systems, but are best suitable for large capacity systems and can provide maneuverability without damaging crops. For these systems, standard suspensions incorporating leaf springs may be used.\nFor smaller field application vehicles having 300-400 gallon capacity, chain drive systems may be used. Typically, these vehicles use narrow tires for driving in between the crop rows and carry application equipment that may expand over 3 to 4 rows. Suspension systems for these vehicles may be nonexistent or simply provided by deflating the vehicle tires to soften the ride.\nThe need arises, however, for a field application (or farm) vehicle which has a capacity for mid-size farms (i.e., a capacity between that for a small application vehicle and that for a large application vehicle) yet the farm vehicle must incorporate a drive system and suspension system which can operate safely within the operational environment utilizing components which fit within the economics of such farms. For example, existing farm vehicles fail to safely meet this need partly because the ground clearance of conventional farm vehicles is dependent on wheel diameter. 
Increasing wheel diameter to increase ground clearance would raise the farm vehicle's center of gravity to an unsafe height, making it especially prone to rollover on rough terrain.\nThus there is a need for a vehicle which can operate within a farm environment without damaging crops having a drive and suspension system capable of carrying a large quantity of field application material. 1. Technical Field\nThe present invention relates in general to designing and simulating digital devices, modules and systems in a distributed simulation environment. In particular, the present invention relates to a method and system that improve a distributed simulation environment to allow for efficient monitoring and utilization of instrumentation events embedded with a simulation model. More particularly, the present invention relates to a method and system for providing centralized access to count event information from testing of a hardware simulation"} +{"output_text": " al.\nThe coreless type is advantageous in that it is smaller in size and lighter in weight, and is thus used in a variety of applications.\nFIG. 1 is a cross-sectional view of a conventional coreless coupling. The coupling includes a pump impeller 1, a turbine runner 2, and a coupling body 3. The pump impeller 1 is rotatably supported by a shaft 4, and the turbine runner 2 is rotatably supported by a shaft 5. The coupling", "input_text": " the respective volumes at those moments, corresponding to the cardiac performance. By performing the transmission and receipt at such moments a very good correlation between the position of the respective cardiac segment, for instance the left ventricular walls, and the cardiac output is obtained. 
Particularly, the difference between minimum and maximum inner diameters of the left ventricle has a good correlation to the cardiac output, said correlation being taken advantage of by this specific arrangement of the device.\nAnother object of the present invention is to provide a rate responsive pacemaker system capable of estimating the cardiac performance, that is a cardiac output, and to use this information as a parameter in order to control its pacing or other operation.\nSuch a rate responsive pacemaker system contains a monitoring device according to the invention as previously described. 1. Field of the Invention\nThe present invention relates to a fluid coupling, and particularly to a fluid coupling without an inner core.\n2. Description of the Related Art\nA fluid coupling (hereinafter \"coupling\") operates to transmit power through fluid between a pump impeller and a turbine runner which are opposed to each other. A coupling does not function to increase torque, and thereby differs from a torque converter, but, rather, functions simply as a coupling for transmitting power. Since fluid couplings do not have a stator, they are smaller and light in construction, and thus have been used as starting devices in vehicles.\nCouplings are classified in accordance with whether or not they have an inner core for guiding the flow of fluid in the coupling. The \"core type\" has such an inner core and the \"coreless type\" lacks the inner core. An \"inner core\" is shown, for example, as member 8 in U.S. Pat. No. 5,005,356 issued to Saunders and as elements 10b in U.S. Pat. No. 4,866,935 issued to Hayabuchi et"} +{"output_text": " antenna system (DAS) and, more particularly, to a DAS that includes a plurality of remote antenna units (RAUs) that are coupled to a central unit via a backhaul link.\nWireless communication systems are widely deployed to provide various telecommunication services such as telephony, video, data, messaging, and broadcasts. 
Typical wireless communication systems may employ multiple-access technologies capable of supporting communication with multiple users by sharing available system resources (e.g., bandwidth, transmit power).", "input_text": " mixed with air.\nIf this air remains in the ink when the ink is returned to the ink gun, it will tend to disrupt the formation of the ink jet. For example, the ink may be at a pressure of approximately three times atmospheric pressure immediately before it leaves the ink gun through the jet-forming nozzle. At this pressure, any air mixed in with the ink will be substantially compressed. Immediately the ink leaves the nozzle, it will be exposed to atmospheric pressure. This pressure change will cause any air mixed into the ink to expand abruptly, disrupting the jet. Additionally, air bubbles can partially block the nozzle, which may make the ink jet become unstable or non-uniform, which in turn interferes with the break-up of the jet into drops so that the drops are incorrectly deflected. The incorrect deflection both results in incorrect printing and ink contamination of the printhead and/or the surface being printed onto. Partial blockage of the nozzle may also change the direction of travel of the ink jet, causing it to strike components of the printhead. Therefore it is desirable to ensure that the air that gets mixed into the ink as it returns to the ink tank is substantially separated out of the ink before the ink is returned to the ink gun.\nThe ink also tends to accumulate undesirable particulate matter, such as dried ink particles, dust, and the like. It is desirable to remove this particulate matter from the ink, since it may cause problems, for example by totally or partially blocking the nozzle of the ink gun. The ink may be passed through a filter to remove such material. 
However, under some circumstances air that is mixed into the ink may pass through the filter, especially if the air is in the form of very small bubbles, and so the filter cannot be relied on to prevent air from remaining in the ink that is returned to the ink gun. This invention relates generally to a distributed"} +{"output_text": " into a radiation irradiated space.\nFurthermore, in the fourth aspect of the present invention, the radiation tomographic imaging apparatus further comprises display means for displaying the tomographic images produced by the tomographic image producing means.\nA radiation tomographic imaging apparatus in a fifth aspect of the present invention comprises: a radiation tube for emitting radiation; a collimator capable of forming the radiation emitted by the radiation tube into a radiation beam to emit the radiation beam and capable of changing a range irradiated by the radiation", "input_text": " subject carried into a radiation irradiated space.\nFurthermore, in the third aspect of the present invention, the radiation tomographic imaging apparatus further comprises display means for displaying the tomographic images produced by the tomographic image producing means.\nA radiation tomographic imaging apparatus in a fourth aspect of the present invention comprises: a radiation tube for emitting radiation; a collimator capable of forming the radiation emitted by the radiation tube into a radiation beam to emit the radiation beam and capable of changing a range irradiated by the radiation beam in response to a first control signal; a detector element array comprising a plurality of radiation detector elements with their irradiated surfaces facing in an impinging direction of the radiation beam, in which array the radiation detector elements are arranged in one of two mutually perpendicular directions to form a detector element row, and a plurality of the detector element rows are arranged side by side in the other of the two mutually 
perpendicular directions; radiation tube moving means capable of moving an emission center of the radiation tube in the other of the two mutually perpendicular directions in response to a second control signal; data collecting means for collecting desired data by selecting or variedly combining detected signals from the detector element rows in the detector element array in response to a third control signal; control means for receiving radiation irradiated range information and outputting the first control signal to the collimator, the second control signal to the radiation tube moving means and the third control signal to the data collecting means corresponding to the information; and tomographic image producing means for producing multi-slice tomographic images of a region through which the radiation beam passes based on radiation detected signals for a plurality of views detected by the detector element array corresponding to the irradiated range information and collected by the collecting means.\nMoreover, in the fourth aspect of the present invention, the radiation tomographic imaging apparatus further comprises rotating means for rotating the radiation tube, collimator and detector element array around a subject carried"} +{"output_text": "096,220 and discusses a non-cylindrical chamber and a plasma source being located midway down the chamber. In addition it provides a method for maintaining a multi-species plasma at a low enough density such that collisions between the particles are relatively infrequent, and introduces one or more collectors positioned to intercept high mass particles.\nU.S. Pat. No. 6,235,202 is a continuation in part of U.S. Pat. No. 6,096,220 and", "input_text": " A plasma is thus comprised of charged particles\u2014generally positive ions and negative electrons.\nU.S. Pat. No. 6,096,220 (\u201cPlasma Mass Filter\u201d), a drawing from which is reproduced in FIG. 
1, which is directed to a process and device 10 for filtering low mass particles from high mass particles in a plasma by means of injecting the plasma into a cylindrical chamber having a magnetic field aligned with the axis, and a perpendicular electric field so as to cause a rotational movement of charged particles in the chamber. The magnitude of the magnetic and electric fields are adjusted such that the high mass particles escape radially and collide with the cylindrical wall, while the low mass particles are confined to travel within the walls.\nThe general function of a filter is quite different from that of a separator. While the former generally requires only that all particles above a certain mass are trapped and all below such a mass pass through\u2014momentum resolution is not a critical design or performance issue. The latter must cleanly separate and collect specific particles that represent the ionic constituents of a particular metal product. Moreover, it is often the case that there is not a large difference in the relative mass of the product particles. For these applications it may be helpful to obtain a measure or parameter related to momentum resolution.\nU.S. Pat. No. 6,248,240 is a continuation in part of U.S. Pat. No. 6,096,220 and discusses a non-cylindrical chamber and a plasma source being located midway down the chamber. In addition it provides a method for maintaining a multi-species plasma at a low enough density such that collisions between the particles are relatively infrequent, and introduces one or more collectors positioned to intercept high mass particles.\nU.S. Pat. No. 6,235,202 is another continuation in part of U.S. Pat. No. 6,"} +{"output_text": " PTPases).\nProtein tyrosine kinases (PTKs) are a class of enzymes that catalyze the phosphorylation of specific tyrosine residues in cellular proteins. 
This post-translational modification of these substrate proteins, often enzymes themselves, acts to modulate cell growth, differentiation and/or proliferation. Aberrant or excessive PTK activity has been observed in many disease states including benign and malignant proliferative disorders as well as diseases resulting from inappropriate activation of the immune system (e.g., autoimmune disorders), allograft rejection", "input_text": " in some instances metal silicide layers are also required as contact layers for the source and drain regions. In those instances, it is desirable from the standpoint of reducing manufacturing cost to form the silicide layers for the polysilicon layer and the source and drain regions in the same processing step. Therefore, a need exists for a method for manufacturing an IGFET device which forms those silicide layers in a single processing step.\nIn some cases of practical importance, a need also exists for a relatively thin silicide layer for the source and drain regions (thereby establishing a prescribed low resistivity while maintaining shallow junctions) which layer is compatible with a relatively thick silicide layer (even lower resistivity) patterned to form part of the gate electrode. Protein phosphorylation is now well recognized as an important mechanism utilized by cells to transduce and regulate signals during different stages of cellular function (Hunter, Phil. Trans. R. Soc. Lond. B 353: 583\u2013605 (1998); Chan et al., Annu. Rev. Immunol. 12: 555\u2013592 (1994); Zhang, Curr. Top. Cell. Reg. 35: 21\u201368 (1997); Matozaki and Kasuga, Cell. Signal. 8: 113\u201319 (1996); Fischer et al, Science 253:401\u20136 (1991); Flint et al., EMBO J. 12:1937\u201346 (1993)). The level of tyrosine phosphorylation is balanced by the opposing action of protein tyrosine kinases and protein tyrosine phosphatases. 
There are at least two major classes of phosphatases: (1) those that dephosphorylate proteins (or peptides) that contain a phosphate group(s) on a serine or threonine moiety (termed Ser/Thr phosphatases) and (2) those that remove a phosphate group(s) from the amino acid tyrosine (termed protein tyrosine phosphatases or"} +{"output_text": " then the control of the gimbal suspension means can be carried out at a high speed.\nHowever, in the above-mentioned image stabilizing apparatus, the angular velocity sensor is mounted to the gimbal suspension means, and the angular position of the gimbal suspension means is detected by the angular velocity sensor. Therefore, the angular velocity sensor is required to be mounted to the gimbal suspension means, and the gimbal suspension means is required to be mounted to the image stabilizing apparatus", "input_text": " cost, life, time required for attaining a necessary inertial force after the power is turned on, and the like. If the effective diameter of objective lenses is made greater along with the increase in power or resolution of binoculars, then the erect prism becomes larger, whereby a large inertial force is required, which enhances the above-mentioned problems, and the power consumption increases along therewith.\nTherefore, the assignee of the present application has proposed an image stabilizing apparatus (Japanese Unexamined Patent Publication No. 6-250100) in which an angular velocity sensor is mounted to gimbal suspension means in place of the above-mentioned rotary inertial member, and the pivoting of the gimbal suspension means is controlled according to the output value from the angular velocity sensor, so as to fix the posture of the erect prism with respect to the earth (inertial system). According to this apparatus, the erect prism held with the gimbal suspension means basically has an inertial force. 
In particular, its posture-keeping capability against vibrations with relatively large amplitude is high with respect to high-speed vibrations with a high vibration frequency. Therefore, the control power for the rotational position according to the angular velocity sensor can be kept small. In other image stabilizing apparatus which drive vari-angle prisms or lenses, however, active driving sections are needed, and it is necessary for the driving sections to be operated at a high speed in order to correct large amplitude in high-frequency vibrations, whereby correction in a wide angle range is difficult.\nWhen binoculars and video cameras are used, panning and tilting are often carried out at a high speed. For example, fast pan/tilt operations are required when flying objects such as birds and airplanes are observed while being tracked.\nHence, if not only the angular velocity of gimbal suspension means but also its angular position is detected,"} +{"output_text": " the telephone companies for the use of the databases. In some cases, the charges are passed on to the service providers, and in other cases the charges are passed on to the telephone companies.\nIn the case of ICS providers, the charges for use of LIDB databases are passed on to the ICS providers, and the charges for use of other databases are passed on to the telephone companies. In the case of CLECs, the charges for use of LIDB databases are passed on", "input_text": " or inclination to provide billing collection for calls processed by an ICS provider. CLECs are to be contrasted with the traditional incumbent local exchange carriers (ILECs), such as a Regional Bell Operating Company (RBOC).\nSome estimates report that upwards of 20 percent of calls sought to be completed by ICS providers are ultimately to be completed by CLECs. 
When combined with mobile services such as cellular telephone services for which billing is sometimes problematic, the combined percentage of calls potentially associated with billing problems can easily approach or exceed twenty-five percent.\nTo help reduce the number of calls completed to telephone stations associated with cellular service, CLEC and other services that do not ensure collection and payment of charges for ICSs, and for other service providers concerned about billing of charges for services rendered, a number of databases are maintained by the telephone industry for checking the status of particular telephone lines. In particular, so-called Line Identification Databases (LIDB databases) provide a variety of information for telephone lines. Thus, a database query launched to a LIDB or similar database for a call can provide some information relating to whether calls to a particular line (and the account and phone number associated with it) can be counted on (or not) to be billed by the number provider.\nOther databases having special information about particular telephone subscriber accounts may also be maintained, including, generally, a class known as billed number screening databases or by other particular names. Some of these are maintained locally by service providers based on past history of payment, credit or other factors. 
All of these may help service providers, including ICS providers, to increase the share of calls that can actually be billed, thereby increasing the share of calls for which charges are actually collected.\nAs might be expected, use of LIDB and other telephone line or account databases by service providers seeking information about particular telephone lines or accounts incurs charges by"} +{"output_text": " is important that the water level in the tank be maintained at a level which ensures that the heating element is maintained at a temperature of approximately two hundred and twelve degrees fahrenheit.\nThe present invention is directed to a method and apparatus for controlling the temperature of a fluid. In one embodiment, the invention is directed to a method for controlling the temperature of a fluid. The method includes the steps of providing a fluid to be heated, providing a fluid to be cooled, and controlling the temperature of the", "input_text": " employ absorbent belts continuously rotating through first a water reservoir and then an air stream to cause humidity. Some employ pumps to lift water from a reservoir and pour it over a porous media through which air flows to cause similar humidification, and some employ wicking pads which are positioned partially below water level and partially above. In such humidifiers, the water level must be maintained for a different reason than that of the ultrasonic humidifier. Specifically, it is important that water level be maintained to ensure consistent humidity efficiency and maximum moisture output. Wick pads generally are capable of drawing water from the reservoir water level to a given height through capillary action. A relatively smaller portion of the wick pad must be positioned below the water level where water is absorbed, than above where air flowing through the pad causes the desired humidification. 
Excessive height of the pad above that height to which water will be drawn not only constitutes wasted wick material and is therefore inefficient by design, but also reduces the humidification efficiency of the humidifier by allowing a pathway for air which does not pass through the moistened portion of the pad, essentially constituting air leakage which reduces the total humidification rate. For this reason, wick type evaporative humidifiers are often designed to maintain a given water level which ensures that the most efficient amount of the wick pad lies above and below the water level to maximize efficiency and output. Accordingly, a water tank similar to that described above often is used with evaporative humidifiers.\nSteam humidifiers cause humidity by boiling water into vapor. A submersible heating element depends from a humidification unit into a boiling chamber within a base. A water tank similar to that described above is positioned on the base to both feed water to the boiling chamber and to maintain a given normal operating level therein. The boiling water maintains the temperature of the heating element at approximately two hundred and twelve degrees fahrenheit. It"} +{"output_text": " requires an additional movement for readying needle for injection.\nThe European patent application EP 0 603 817 discloses a needle protecting device comprising a tubular sleeve which is slidably mounted on the cannula. The sleeve is provided with a stop surface for the needle. The sleeve is moved by a spring device to a position in which the needle is covered. The sleeve is then moved by a second spring device to a position in which the needle is exposed. The sleeve is then moved by a third spring", "input_text": " insulin and human growth hormone which so far only can be administered by injections with therapeutical efficacy. 
Many patients who must be confined to such long term regimens are, however, severely discomforted by the constant use of needles and they frequently feel strong aversions to the visual appearance of a needle and the subsequent predicted pain sensation. In particular, many infant patients in need of regular growth hormone injections might require devices that visually hide the needle during the entire administration procedure to overcome the inconvenience of the therapy. Especially, since the growth hormone injections, in contrast to e.g. insulin therapy for diabetes, are not linked to a direct alleviation of symptoms, growth hormone patients would benefit considerably from a device that can provide an acclimatization period to an initially uncomfortable therapy by means of injections.\nIn the International patent application WO 93/05835, a needle protecting device is disclosed comprising a displaceable tubular sleeve which in its extended position completely surrounds the injection needle. Such an arrangement is, however, complicated in structure and therefore expensive, and might be difficult to use for children, since it requires that the user overcomes the forces of a spring device for injection. It must also be disengaged from the body of the pen-formed syringe and be remounted thereto after each change of needle or cannula, which can be experienced as troublesome and in the worst case might lead to an improper performance.\nA resilient needle protecting device, for example in the form of a tubular bellows, is disclosed by the German patent application DE 42 01 228. This device is arranged with an attached movable stop surface with a hole for the needle. It requires an additional movement for readying the needle for injection. The attached stop surface also means that a certain part of the needle will not be used in injection. The International patent application WO 93/24162 discloses a protective sheath member slidably disposed over the cannula. 
This construction also"} +{"output_text": " ends of the elastic contact terminals 86.\nFIG. 23 shows a third embodiment of the conventional board-connecting connector (see Patent Document 3).\nThis board-connecting connector 89 includes: a coil spring 91 connected to an outer terminal 90 at an inside of a connector housing 92 made of insulating synthetic resin; a toggle switch 93 pushed forward by the coil spring 91; and a pair of upper and lower elastic contact terminals 94 fixed to conducting parts of the toggle switch 93, projected outward when", "input_text": " housings 76.\nA pair of upper and lower slope walls 79 are formed on a rear side of an inside of the connector housing 73. A spring 80 pushes top ends of the inner housings 76 in an opening direction. When the connectors 74, 78 are connected to each other, the top ends of the inner housings 76 are closed while sliding on the slope walls 79. Thus, inner elastic contact terminals 75 contact terminal parts of the print circuit board 72. Because a pair of inner housings 76 are open at a beginning of a connection of the connector 71, the connection is carried out with a low connection force.\nFIG. 22 shows a second embodiment of the conventional board-connecting connector (see Patent Document 2).\nThis board-connecting connector 81 includes: a coil spring 84 connected to an outer terminal 83 at an inside of a connector housing 82 made of insulating synthetic resin; a toggle switch 85 pushed forward by the coil spring 84; and a pair of upper and lower elastic contact terminals 86 fixed to conducting parts of the toggle switch 85, projected outward when the connector 81 is not connected, and received in the connector housing 82 when the connector 81 is connected.\nWhen the end of a circuit board 87 is inserted into an interior of the connector housing 82, the circuit board 87 pushes the toggle switch 85. 
Then, the toggle switch 85 and the elastic contact terminals 86 are moved backward, and then the pair of elastic contact terminals 86 hold the circuit board 87 in the connector housing 82. Because the elastic contact terminals 86 are open at the beginning of the insertion of the circuit board 87, the circuit board 87 is inserted with low insertion force.\nFor locking the circuit board 87 on the board-connecting connector 81, it is disclosed that holes (not shown) are formed on the circuit board 87, and projections (not shown) for engaging with the holes are formed at top"} +{"output_text": "TOF) are known as methods for solving the aforementioned problems.\nThe E-Trap is a method for trapping a charged particle by using a plurality of electrodes arranged in a plane. The E-Trap is a method for trapping a charged particle by using a plurality of electrodes arranged in a plane, and is a method for trapping a charged particle by using a plurality of electrodes arranged in a plane. The E-Trap is a method for trapping a charged particle by using a plurality of electrodes", "input_text": " wire\u201d and is generally used after being cut to be remarkably shorter than the lower steel wire B.\nFurther, in manufacturing such a conventional gabion mesh, only an intermittently automated process rather than a fully automated process can be employed. This is because a conventional method for manufacturing the gabion mesh employs the shortly cut upper steel wire A, a plurality of upper steel wires A should be generally supplied until the gabion mesh is completely manufactured using a single lower steel wire B, and respective tie operations for the upper steel wires A to the lower steel wire B should be manually performed. Thus, there is a disadvantage in that in manufacturing the conventional gabion mesh, the manufacturing process cannot be fully automated.\nFurthermore, there is a disadvantage in that skilled workers are required for manufacturing the conventional gabion mesh. 
This is because, upon manufacture of the conventional gabion mesh, the upper steel wires A should be repeatedly coupled to the upper slider during the manufacture thereof, and such coupling operations make the automation of the manufacturing process difficult and require craft of skilled workers.\nIn addition, there is a critical disadvantage in that the method for manufacturing the conventional gabion mesh has very low productivity. This is because the manufacturing process of the conventional gabion mesh is performed intermittently and depends on a partially automated process, at least two or three skilled workers are required according to the size of the gabion mesh, and it takes at least 20 to 30 minutes whenever the aforementioned coupling process is performed even by such skilled workers.\nSince these problems with the manufacturing process result from the configuration itself of the conventional gabion mesh, there are insoluble limitations on the problems so far as the coupling structure of the gabion mesh or each unit of the gabion mesh is not fundamentally changed. Electrostatic trap (E-Trap) and multi-pass time-of-flight ("} +{"output_text": ". In this case, a body (3) is exposed to coherent radiation (2) of pre-determined frequency, the radiation reflected by the body (3) or the radiation which has passed through the body (3) being imaged by an imaging optical system (6) in an image plane (7) in which a sensor (8) is located. 
A reference radiation generated in accordance with a shearing method is superimposed on the sensor (8), and the phase of the radiation (5)", "input_text": " and the IPv6 terminal can be realized by a simple operation of adding a fixed pattern of 96 bits to the IPv4 address or deleting the fixed pattern of 96 bits from the IPv6 address.\nAccording to the method called a dual stack, by selectively using the communication protocols of IPv4 and IPv6 in accordance with a communication partner, the communication between the IPv4 terminal and the IPv6 terminal can be realized.\nAccording to the method called an IP tunneling, by encapsulating the packet by the header of the relevant communication protocol and passing the resultant data through the network existing on the communication path between the two terminals, the communication between the two terminals can be realized. The invention relates to a method for the direct phase-angle measurement of radiation in accordance with light radiation reflected by a body (3) or passing through a transparent body, in which the body (3) is exposed to coherent radiation (2) of pre-determined frequency or the body (3) is coated with a lacquer in which particles diffusely reflecting the radiation are stored and which is exposed to non-coherent radiation (2) of a pre-determined frequency, the radiation reflected by the body (3) or the radiation which has passed through the body being imaged by an imaging optical system (6) in an image plane (7) in which a sensor (8) is located, a reference radiation generated in accordance with a shearing method being superimposed on the sensor (8), and the phase of the radiation (5) from the body (3) being determined from the measurement signals of the sensor (8). 
It further relates to an apparatus for the performance of such a method.\nA method for the direct phase-angle measurement of radiation, in particular of light radiation, and an apparatus for the direct phase-angle measurement of radiation, in particular of light radiation, are known from EP 0 419 936 B1"} +{"output_text": " is set by the user. The user sets the tension by turning the tension crank 120 until the string is taut. The user then sets the tension by turning the tension crank 120 in the opposite direction. The tension head assembly 100 is designed so that the tension in the string is set at a predetermined tension. Historically, in tennis racket stringing machines, this predetermined tension is set by the user. The user sets the tension by turning the tension crank 120 until the string is taut. The user then", "input_text": " must continuously be improved to provide better overwrite (OW) capabilities and reduced fringing fields as track pitch increases with reduced write track width and write gap. In FIG. 9, a graph 900 shows a three-dimensional finite-element calculation of deep gap field vs. the current-coil-turn product (where N is coil turns and I is current through the coil). As apparent from graph 900, a short throat height is imperative to achieve a high deep gap field for narrow track write heads, which corresponds to a higher write field for superior writeability.\nWhat is needed is an improved write head design and apparatus which provides for a reduced throat height and a superior mechanical stability. Tennis rackets are strung with the use of a stringing machine. FIG. 1 displays a standard embodiment of the prior art. A tennis racket 20 is placed in a mounting plate 60 and clamped in place. A string is threaded through grommets in the tennis racket. The string is held in place within the tennis racket by a string clamp. The free end of the string is threaded through a roller mounted within the tension head 140. 
The tension head 140 is incorporated with other items to comprise the tension head assembly 100. The tension head assembly 100 is mounted on the winder bar 40. The tension head assembly 100 includes a tension crank 120. Turning the tension crank 120 causes the tension head assembly 100 to move along the winder bar 40. When a string is threaded through the tension head 140, a user can turn the tension crank 120 to move the tension head 140 away from the mounting plate 60. This movement pulls on the string and creates the necessary tension in the string until it is secured in place on the racket 20.\nThe tension head assembly 100 is designed so that the tension in the string is set at a predetermined tension. Historically, in tennis racket stringing machines, this predetermined tension"} +{"output_text": " bottom sides of a circuit board. The electronic connectors are installed in the top and bottom sides of the circuit board, and the terminals of the electronic connectors are soldered to the respective pads of the circuit board. The electronic connectors are installed in the top and bottom sides of the circuit board, and the terminals of the electronic connectors are soldered to the respective pads of the circuit board. The electronic connectors are installed in the top and bottom sides of the circuit board, and the terminals of the electronic connectors", "input_text": " respective through holes and then fixedly soldered to the respective through holes of the circuit board. Alternatively SMT (surface mounting technology) may be employed to bond the terminals of the electronic device to the tin paste at the respective pads of the circuit board. For mass production, SMT is commonly used to install electronic devices in circuit boards at two sides. 
During the application of SMT, an electrically conductive medium, for example, tin paste is applied to the circuit board, and then electronic devices are attached to the circuit board, and then the circuit board with the electronic devices are put in a high temperature stove for baking, causing the electrically conductive medium to be melted and bonded to the respective mounting ends of the terminals of the electronic devices. After cooling, the terminals of the electronic devices are fixedly and electrically connected to the tin paste at the respective pads (contacts) of the circuit board. Because the terminals of the electronic devices are to be fastened to the respective pads of the circuit board, the electrically insulative shell of each electronic device has terminal holes through which the respective terminals extend to the outside for mounting. During baking in the high temperature stove, a siphon effect may be produced in the terminal holes, thereby causing the molten electrically conductive medium (the tin paste) to be sucked into the inside of the electrically insulative shell and covered over the contact end of each terminal. When an overflow of electrically conductive medium (the tin paste) occurs, the terminals may be short-circuited, or the structural strength of the terminals may be weakened. Further, in order to minimize installation space, electronic devices and/or connectors may be installed in both the top and bottom sides of a circuit board. In this case, the electronic devices and/or connectors at the front side of the circuit board will be heated twice in the high temperature stove.\nFIG. 7 shows electronic connectors installed in top and"} +{"output_text": ". The PDTM 110 is connected to a phone 120 via a USB cable 130. The phone 120 is connected to a PDA 140 via a Bluetooth connection 150. The PDTM 110 transfers data from the phone 120 to the PDA 140. 
The PDA 140 may be a personal digital assistant (PDA) such as a Palm Pilot, a Blackberry, a Treo, etc. The PDTM 110 may also transfer data from the PDA 140 to the phone 120.\nIn FIG", "input_text": " the resonant frequency of the ceramic element.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described wherein sequential application of a voltage across input electrode segments creates a traveling compression wave in the ceramic element.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described wherein such a traveling compression wave in the ceramic element creates output voltages whose phases are dependent on the phases of the input voltages.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described with a traveling compression wave in the ceramic element that may be used in polyphase power applications.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described wherein the output voltages may drive more than one load.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described wherein the output voltages may be added in series to drive a single load.\nIt is another object of the present invention to provide a piezoelectric transformer of the character described wherein the output of one electrode segment may be used to provide a feedback voltage to the driving circuit of the piezoelectric transformer.\nFurther objects and advantages of my invention will become apparent from a consideration of the drawings and ensuing description thereof. Often, transferring data in phones can be very cumbersome. In particular, modern phones may hold multiple gigabytes of data comprising pictures and other graphical representations, address records, emails, etc. 
A lot of overhead going through the applications creates a data bottleneck for service stations and other stores that offer such data transfer services.\nFIGS. 1A and 1B show two typical telephone/PDA device data transfer stations. In FIG. 1A, transfer station 100 has a phone data transfer machine (PDTM) 110, typically a PC with USB and Bluetooth connectivity running phone data transfer applications such as PC Suite, PC Tools and other phonebook transfer applications"} +{"output_text": " entries are not used in the mapping.\nThe mapping from CSI reference signal configuration to (k\u2032, l\u2032) for normal cyclic prefix is illustrated in Table 6.10.5.2-2: illustrating a mapping from CSI reference signal configuration to (k\u2032, l\u2032) for extended cyclic prefix, additionally including the identification of Cell index, CDM group.\nIn addition, Cell index and CDM group entries in brackets are meant to indicate which entries are not used in the mapping", "input_text": "(2.2)1Px(Hy)(Cu)16(1.2)1Qx(Dy)(Dy)17(0.2)1Rx(Iy)(Du)18(3.5)1Sx(Ey)(Ey)19(2.5)1Tx(Jy)(Eu)Frame20(11.1)1Ax(11.1)1Ax(11.1)1Axstructure21(9.1)1Bx(9.1)1Bx(9.1)1Bxtype 222(7.1)1Cx(7.1)1Cx(7.1)1Cxonly23(10.1)1Dx(10.1)1Dx(Az)24(8.1)1Ex(8.1)1Ex(Bz)25(6.1)1Fx(6.1)1Fx(Cz)26(5.1)1Gx(Ay)(Ay)27(4.1)1Hx(Dy)(Au)28(3.1)1Ix(By)(By)29(2.1)1Jx(Ey)(Bu)30(1.1)1Kx(Cy)(Cy)31(0.1)1Lx(Fy)(Cu)\nThe table corresponds to that included 3GPP TS 36.211 V12.3.0 under section 6.10.5.2 in Table 6.10.5.2-1: illustrating a mapping from CSI reference signal configuration to (k\u2032, l\u2032) for normal cyclic prefix, additionally including the identification of Cell index, CDM group.\nIn addition, Cell index and CDM group entries in brackets are meant to indicate which"} +{"output_text": " a bottom-up analysis scheme. 
This higher level of structure is the structure of the text regions themselves.\nThe present invention further presents a method for classifying the text regions of a scanned page into two kinds of regions: (i) regions encompassing running text, i.e., text formatted in paragraphs and columns; and (ii) regions encompassing text formatted in other layout structures, such as headings, lists, and tables. This classification further, supports the present invention in the classification of the non", "input_text": " of simplifying the description of the instant invention.\nThis invention first presents a geometric, bottom-up method for partitioning a scanned page into two kinds of regions: (i) regions encompassing running text, i.e., text formatted in paragraphs and columns; and (ii) regions encompassing text formatted in other layout structures, such as headings, lists, and tables. This partitioning further supports the present invention in the classification of the non-running text regions so as to enable selective scanning for text recognition within non-running text regions. For example, table detection, crucial for establishing reading order, is done by testing the non-running text regions for alignment and similar relationships in order to classify the image.\nPage layout analysis processes may be divided into two broad categories, geometric layout analysis and logical structure analysis (R. Haralick. Document image understanding: geometric and logical layout. Proc. IEEE Conf. On Computer Vision and Pattern Recognition, 1994: 385-390). Geometric layout analysis, also termed bottom-up analysis, extracts whatever structure can be inferred without reference to models of particular kinds of pages - e.g., letter, memo, title page, table, etc. Intuitively, this is the structure that would be apparent even to an illiterate person. Also, it is the structure common to pages of all kinds. Logical structure analysis 
classifies a given page within a repertoire of known layouts, and assigns functional interpretations to components of the page based on this classification. Geometric analysis is generally preliminary to logical structure analysis.\nBottom-up analysis schemes attempt to segment a page into homogeneous regions of text, line art (graphics), and photographs (halftone images), and then stop. This is normally taken to be the highest level of structure that can be established bottom-up. However, the present invention establishes that a yet higher level of structure can be extracted using"} +{"output_text": " the bladder neck and the prostate. The present invention is directed to a minimally invasive treatment for BPH that is less traumatic to the patient and that does not require the removal of the urethral lining, the bladder neck or the prostate.\nThe present invention is directed to a method and apparatus for treating BPH. The apparatus includes a catheter having a distal end and a proximal end. The catheter includes a lumen extending from the proximal end to the distal end. The catheter includes a plurality of electrodes disposed", "input_text": " away from a curved blade finding this does not match up well with the shape of lumens (at least in arteries) resulting in cutting into the lumen too deeply in parts to create an uneven inner surface (see Abstract, claims 5, 3:59 and 1:53-2:8.) There is also no disclosure of a rotating, circular, guillotine, or end cutting motion. Rather, the cutting mechanism disclosed in U.S. Pat. No. '130 results from one or more straight edged blades sliding or reciprocating back and forth across a rectangular cutout window at the distal end of a housing, severing atheromas or tissue as it closes the window. The straight cutting blade or blades are as long as the cutout window. 
Thus, in the cutting position in which the window has just closed, the straight cutting blade(s) completely occlude the window so that no more tissue can enter or exit (3:9-15). U.S. Pat. No. '130 also refers only to electrical activation of the cutting elements while the present invention also includes manual mechanical, pneumatic, hydraulic, and solar-powered electrical activation. More specifically, U.S. Pat. No. '130 describes electrically heating one or more elements on the blade for ablating tissue when adapting the device for prostatectomy (2:9-19 and 7:53-63) and is self-described as a device for performing the TURP procedure \u201cin which an enlarged or diseased prostate gland is removed\u201d (2:10-11). This implies the blade must be moved slow enough to allow time for heat transfer to the tissue. A heating element on the blade suggests tissue is ablated with heat rather than mechanically severed.\nThus, the common approaches to BPH treatment are not minimally invasive and result in trauma to and the removal of the urethral lining,"} +{"output_text": " The dye pack is designed to explode in a few seconds, releasing the dye and marking the money. The dye is usually a dye that is not visible to the naked eye, such as a fluorescent dye. The dye is usually a dye that is not visible to the naked eye, such as a fluorescent dye. The dye is usually a dye that is not visible to the naked eye, such as a fluorescent dye. The dye is usually a dye that is not visible to the naked eye, such as", "input_text": " in conviction. This is purely a method for identification after apprehension.\nBanks also use explosive devices and tracking systems both for apprehension (and recovery) as well as prevention. They can be used in prevention when word is leaked that these devices are used in a particular area. 
It is stressed that training in this case is critical because many bank robbers present instructions to not pass \u201cbait money or dye packs\u201d since knowledge of these methods are apparently fairly well known. Each bank establishes their own policies and methods for training and passing these devices to a robber. The dye pack was invented to stain stolen currency in a manner that \u201crenders a bank robbery pointless\u201d as taught by Robeson (U.S. Pat. No. 3,564,525) and Howett (U.S. Pat. No. 1,923,979). It does this by exploding a colored dye that stains the money, making it obvious that it has been involved in a robbery. In some devices, tear-gas is included in the dye to not only mark the presence of the stolen currency but also show a colored cloud for searching by law-enforcement. A dye pack consists of a hollowed-out stack of real bills with chemicals and electronics inside, usually with one or two bills stuck on the top and bottom of the stack. The dye pack sits idle in \u201csafe mode\u201d while in the teller drawer on a magnetic plate until a robbery occurs. During the robbery the teller is supposed to subtly slip the device in with other money. Removing the device from the magnetic plate does not cause the dye to be released; it is simply armed at that point. When the bank robber passes a radio activation field near the front door the device is programmed to start a timer for later release, allowing time for the bank robber to get some distance from the bank before the money is stained."} +{"output_text": " of myopathy, encephalopathy, lactic acidosis, and stroke-like episodes (SLEs). The biochemical hallmark of complex I deficiency is a decrease in the activity of complex I, which is the largest and most complex of the respiratory chain complexes. Complex I deficiency is caused by mutations in the nuclear gene encoding the NADH dehydrogenase subunit 6 (ND6), which is located in the mitochondrial DNA (mtDNA). 
The ND6 gene encodes a protein of 1,977 amino acids,", "input_text": "axia, and raised cerebrospinal fluid (CSF) protein levels (e.g., >100 mg/dL). Additional features associated with KSS may include myopathy, dystonia, endocrine abnormalities (e.g., diabetes, growth retardation or short stature, and hypoparathyroidism), bilateral sensorineural deafness, dementia, cataracts, and proximal renal tubular acidosis. Thus, KSS may affect many organ systems.\nCo-Enzyme Q10 Deficiency is a respiratory chain disorder, with syndromes such as myopathy with exercise intolerance and recurrent myoglobin in the urine manifested by ataxia, seizures or mental retardation and leading to renal failure (Di Mauro et al., (2005) Neuromusc. Disord., 15:311-315), childhood-onset cerebellar ataxia and cerebellar atrophy (Masumeci et al., (2001) Neurology 56:849-855 and Lamperti et al., (2003) 60:1206:1208); and infantile encephalomyopathy associated with nephrosis. Biochemical measurement of muscle homogenates of patients with CoQ10 deficiency showed severely decreased activities of respiratory chain complexes I and II+III, while complex IV (COX) was moderately decreased (Gempel et al., (2007) Brain, 130(8):2037-2044).\nComplex I Deficiency or NADH dehydrogenase NADH-CoQ reductase deficiency is a respiratory chain disorder, with symptoms classified by three major forms: (1) fatal infantile multisystem disorder, characterized by developmental delay, muscle weakness, heart disease, congenital lactic acidosis, and respiratory failure; (2) myopathy beginning in childhood or in adult life, manifesting as exercise intolerance or weakness; and (3) mitochondrial encephalomyopathy (including MELAS), which may begin in childhood or adult life and consists"} +{"output_text": "899 (Gheewala, et al.), entitled \"Method and Apparatus for Sensing Defects in Integrated Circuit Elements\", incorporated by reference herein, describes a method of testing for defects in an integrated circuit element by 
comparing the output of the element with a reference voltage. The reference voltage is generated by a reference cell. The reference cell is formed by a plurality of transistors, each transistor having a gate connected to a different one of a plurality of reference lines. The reference lines are", "input_text": " ON or OFF state of each switch is governed by a control input from a probe line. The probe and sense lines are connected to external test electronics. By excitation of an appropriate probe line, and monitoring (or exciting) an appropriate sense line, test signals present at any one of the test points can be monitored (or controlled). Generally, four lines per die are required: power, ground, a plurality of probe lines, and a plurality of sense lines. (In the case of a die requiring multiple supply voltages,\nU.S. Pat. No. 4,749,947 also suggests the possibility of cross-checking multiple ICs on a wafer. FIG. 7 therein shows a grid of numerous probe and sense lines criss-crossing multiple ICs. FIGS. 9a and 9b therein also show many ICs being cross-checked on a wafer. In FIG. 9a, the usually unused \"kerf area\" (scribe line) lying between adjacent ICs is used to place probe points for the probe and sense lines. In FIG. 9b, it is suggested that I/O pads on \"other\" (typically adjacent) ICs can be used as probe points for cross check testing a particular ICs, when the \"other\" ICs are not being cross checked.\nU.S. Pat. No. 4,937,826 (Gheewala, et al.), entitled \"Method and Apparatus for Sensing Defects in Integrated Circuit Elements\", incorporated by reference herein, describes an improvement to the technique of the aforementioned U.S. Pat. No. 4,749,947, involving pre-charging the sense lines to adjust detection levels. The patent also discloses a method of reducing test patterns to Boolean expressions, using \"path sensitization\".\nU.S. Pat. No. 
4,975,"} +{"output_text": "5) cyclic anhydrides; (6) acid esters; (7) phosphine oxides; and (8) mixtures thereof; (c) an optional filler; (d) an optional antistatic agent; and (e) an optional biocide.\nU.S. Pat. No. 5,262,259 discloses a toner composition comprising toner particles comprised of a polymer with a glass transition temperature of from about 50.degree. C. to about 100.degree. C. and", "input_text": " (C) vinyl alcohol-vinyl acetal copolymers; (D) polycarbonates; and (E) mixtures thereof; and (2) an additive having a melting point of less than about 65.degree. C. and a boiling point of more than about 150.degree. C. and selected from the group consisting of (1) furan derivatives; (2) cyclic ketones; (3)lactones; (4) cyclic alcohols; (5) cyclic anhydrides; (6) acid esters; (7) phosphine oxides; and (8) mixtures thereof; (c) an optional filler; (d) an optional antistatic agent; and (e) an optional biocide. Also disclosed is a process for generating images which comprises (1) generating an electrostatic latent image on an imaging member in an imaging apparatus; (2) developing the latent image with a toner which comprises a colorant and a resin selected from the group consisting of (A) polyesters; (B) polyvinyl acetals; (C) vinyl alcohol-vinyl acetal copolymers; (D) polycarbonates; and (E) mixtures thereof; and (3) transferring the developed image to a recording sheet which comprises (a) a substrate; (b) a coating on the substrate which comprises (1) a binder selected from the group consisting of (A) polyesters; (B) polyvinyl acetals; (C) vinyl alcoholvinyl acetal copolymers; (D) polycarbonates; and (E) mixtures thereof; and (2) an additive having a melting point of less than about 65.degree. C. and a boiling point of more than about 150.degree. C. 
and selected from the group consisting of (1) furan derivatives; (2) cyclic ketones; (3) lactones; (4) cyclic alcohols; ("} +{"output_text": " friendly\u201d).\nGPS receivers are typically used in a variety of applications, including transportation applications. For example, a GPS receiver may be used to determine the position of a vehicle, such as a truck, train, or ship, as it travels along a road, a rail line, or a waterway. In such applications, the vehicle may be equipped with a GPS receiver that is capable of determining its position, and the position of the vehicle may be determined by a GPS receiver at a fixed location", "input_text": "\u2019 from space to space on the reel, with any awards determined for the single wild symbol being positioned at each location on the screen. Among the most detailed sequence of events employed in one embodiment comprise the steps of showing a triggering symbol to initiate the progressively moving wild symbol feature. The number of lines and amount of wager are carried over. Sounds accompany the progressively moving wild symbol feature. The moving wild symbol changes back-and-forth between images (e.g., an iceberg and a penguin). The win meter increments for each partial pay feature. 1. Field of the Invention\nThe present invention relates generally to positioning systems, and more particularly to methods for using such systems to determine relative differential positioning for transportation applications.\n2. Related Art\nAs is well known in the relevant art(s), the Department of Defense's Global Positioning Satellite (GPS) constellation operationally consists of twenty-four satellites that provide global coverage for determining the geographic position of a user equipped with any of a variety of commercially-available receivers. GPS receivers are capable of receiving the L-band radio signals emitted from the satellites in the constellation whose orbits have an altitude of approximately 12,660 miles above the Earth. 
For any given signal reading, at least four satellites are required to compute the three dimensions of position (X, Y, and Z or latitude, longitude and altitude, respectively) and time.\nMore specifically, GPS receivers receive transmissions of at least four satellites and combine the information with information in an electronic almanac, so that it can mathematically determine the receiver's position on Earth in a well-known manner. The basic information a GPS receiver provides is the latitude, longitude and altitude, or some similar measurement, of its current position. Most receivers then combine this data with other information, such as maps, to make the receiver more useable (i.e., more \u201cuser"} +{"output_text": " are characterized by different physical properties, such as solubility, melting point, density, etc. The two polymorphic forms are known to be different in their chemical and biological properties. For instance, the solubility of the sodium salt of fosinopril in water is reported to be about 0.1 mg/ml for Form-A and about 0.5 mg/ml for Form-B. The melting point of the sodium salt of fosinopril is reported to be about 140\u00b0 C. for", "input_text": "omer (9 B). (i) Besides the aforementioned process patents, various methods are reported for preparation of key intermediates required for synthesis of fosinopril. For instance, U.S. Pat. No. 4,168,267 (Petrillo, Jr., et. al.), U.S. Pat. No. 4,384,123 (Petrillo. Jr., et. al.), U.S. Pat. No. 4,448,772 (Karanewsky et. al.), U.S. Pat. No. 4,594,199 (Thottathil et. al.) and U.S. Pat. No. 4,602,092 (Thottathil et. al.) disclose processes for synthesis of the phosphinyl acetic acid fragment of fosinopril. U.S. Pat. No. 4,316,905 (Krapcho et. al.), U.S. Pat. No. 4,501,901 (Thottathil et. al.), U.S. Pat. No. 4,588,819 (Thottathil et. al.), U.S. Pat. No. 4,734,508 (Thottathil et. al.), U.S. Pat. No. 4,912,230 (Anderson et. al.), U.S. Pat. No. 4,912,231 (Kronenthal et. al.) and U.S. Pat. No. 
4,937,355 (Kloss et. al.) describe processes for synthesis of the optically active (cis/trans)-4-cyclohexyl-L-proline fragment. (ii) In addition, it is known that the sodium salt of fosinopril can exist in two polymorphic forms, designated as Form-A and Form-B. The polymorphic forms"} +{"output_text": " compounds.\nThere is also a need to rapidly assay or screen compounds for their effects on various biological processes. Nearly all biological activity is regulated by the interactions of proteins in cells. Proteins are the catalysts, motion transducers, and signal mediators of cells. They control cell division, cell growth, cell differentiation, cell death, and mediate the responses of cells to their environments. Enzymologists have long sought better substrates, better inhibitors and better catalysts for enzymatic reactions. To understand cellular processes, we therefore", "input_text": "\nA further object of the present invention is to provide a new containerization system for agrichemicals which allows very efficient packing and storing due to flexible, optionally flat bags.\nAn object of the invention is to avoid the risk of spill or pollution and to increase the safety of water soluble packaging of agrichemicals.\nOther objects of the invention will better appear from the following description. The present invention relates to multiple array systems for integrating arrays of biomolecules, including biological, chemical and biochemical elements.\nThere is a need to rapidly assay compounds for their effects on various biological processes. Nearly all biological activity is regulated by the interactions of proteins in cells. Proteins are the catalysts, motion transducers, and signal mediators of cells. They control cell division, cell growth, cell differentiation, cell death, and mediate the responses of cells to their environments. Enzymologists have long sought better substrates, better inhibitors and better catalysts for enzymatic reactions. 
To understand cellular processes, we therefore need to monitor the activity of proteins, and to determine the networks of interactions of proteins within cells.\nIn the past, tools available to biologists only allowed the study of one interaction at a time because there were no analytical tools that would allow large numbers of protein interactions to be monitored simultaneously. Thus, a system that would allow parallel analyses of protein interactions would be of immense value and would speed the progress of biological discovery.\nIn addition, there is a need to rapidly assay or screen compounds for potential drug candidates. Drug discovery is a long, multiple step process involving the identification of specific disease targets, development of assays based on a specific target, validation of the assays, and optimization and automation of the assay to achieve screening of a large number of candidates. After high throughput screening of compound libraries using various assays, hit validation and hit compound optimization procedures are employed. 
Performing a screen on many thousands of compounds thus requires parallel processing of many"} +{"output_text": "ropyl)trisulfide, 3,3xe2x80x2-bis(trimethoxysilyl-2-methylpropyl)tetrasulfide, 3,3xe2x80x2-bis(dimethoxyphenylsilyl-2-methylpropyl)disulfide, 3,3xe2x80x2-bis(trimethoxysilyl-2-ethylpropyl)tetrasulfide, 3,3xe2x80x", "input_text": " 3,3xe2x80x2-bis(butyl dimethoxysilyipropyl)trisulfide, 3,3xe2x80x2-bis(phenyl dimethoxysilylpropyl)tetrasulfide, 3-phenyl ethoxybutoxysilyl 3xe2x80x2-trimethoxysilylpropyl tetrasulfide, 4,4xe2x80x2-bis(trimethoxysilylbutyl)tetrasulfide, 6,6xe2x80x2-bis(triethoxysilylhexyl)tetrasulfide, 12,12xe2x80x2-bis(triisopropoxysilyl dodecyl)disulfide, 18,18xe2x80x2-bis(trimethoxysilyloctadecyl)tetrasulfide, 18,18xe2x80x2-bis(tripropoxysilyloctadecenyl)tetrasulfide, 4,4xe2x80x2-bis(trimethoxysilyl-buten-2-yl)tetrasulfide, 4,4xe2x80x2-bis(trimethoxysi lylcyclohexylene)tetrasulfide, 5,5xe2x80x2-bis(dimethoxymethylsilylpentyl)trisulfide, 3,3xe2x80x2-bis(trimethoxysilyl-2-methylpropyl)tetrasulfide, 3,3xe2x80x2-bis(dimethoxyphenylsilyl-2-methylpropyl)disulfide.\nThe preferred sulfur containing organosilicon compounds are the 3,3xe2x80x2-bis(trimethuxy or triethoxy silyip"} +{"output_text": " liquid crystal dispersion 4 is broadened in the visible region. Therefore, the display color is not sufficiently sharp.\nIn order to solve this problem, a method of adding a dye to the polymer/cholesteric liquid crystal dispersion has been proposed. However, the addition of a dye to the polymer/cholesteric liquid crystal dispersion is not preferable because the dispersion is colored.\nAs described above, the polymer/cholesteric liquid crystal dispersion has a problem in that the display color is not", "input_text": " is applied between the pair of electrodes. In this state, the spiral is loosened, molecules are oriented perpendicular to the surface of the substrate, and light is transmitted. 
These three oriented states can be switched among each other by applying voltage between the electrodes.\nAccordingly, if a light absorber having a color such as a black color is disposed on the backside of the cell, it is possible to obtain a bright display colored with the selective reflection color during the P orientation and a dark display colored with the black color of the light absorber during the F or H orientation. Among the above orientation forms, both the P orientation and the F orientation can exist stably without using any power source. The utilization of this property makes it possible to attain a memory display in which a display is maintained without using any power source.\nOn the other hand, a structure is known in which a polymer/cholesteric liquid crystal dispersion 4 obtained by dispersing a cholesteric liquid crystal 2 as particles in a polymer 1 is sandwiched between a pair of substrates 21 and 22 having electrodes 11 and 12, as shown in FIG. 9, instead of sealing the cholesteric liquid crystal directly between a pair of substrates having electrodes.\nIn this case as well, the above display principle may be similarly utilized. The polymer/cholesteric liquid crystal dispersion is more resistant than ordinary liquid crystal cells to stresses applied from the outside. Therefore, the dispersion is not only resistant to the breakdown of a stored image but can also be apparently handled as a solid. As a result, there are advantages in that the polymer/cholesteric liquid crystal dispersion can be handled in, for example, a production process more easily than a liquid cholesteric liquid crystal and can be laminated on other functional films such as an optical conductor.\nAs shown in FIG. 10, however, the reflection spectrum of the polymer/cholesteric"} +{"output_text": "level 4) 30 is the next highest resolution scale, and level five 32 is the next highest resolution scale. 
The lowest resolution scale (level 5) 32 is the next highest resolution scale, and level six 34 is the next highest resolution scale. The lowest resolution scale (level 6) 34 is the next highest resolution scale, and level seven 36 is the next highest resolution scale. The lowest resolution scale (level 7) 36 is the next highest resolution scale, and level eight 38 is the next highest", "input_text": " are sent to a client system with minimal processing, i.e. as much processing is done ahead of time as possible. Optionally, although not required in tile-based mapping systems, tile addressing can follow a single global projection, tiles can be distributed using a client/server system architecture, and tiles can be organized into relatively few, fixed layers. Tile images can be of any size, and can vary from map scale to map scale, or can vary across the same scale or be random in size. What is needed is a system to determine which size is most efficient for a specific purpose.\nReferring now to FIG. 1A (PRIOR ART), tiled image sets are created from collections of random source images that may have sizes and boundaries that follow no specific system. Collections of source images can take a variety of forms, for example, high resolution aerial imagery of the fifty largest cities, each represented by a small number (for example 5-50) of large source images (for example 10,000\u00d710,000 pixels). Each of the source images can be a different size, cover a different portion of the earth, or have a different map resolution. Taken together, all the source images for all the cities form a single map layer. Each layer 12 of tile images has multiple scales or levels 14. Every tile set starts with a base scale 221 which is the scale with the highest number and highest resolution imagery. Each subsequent scale is a lower version of the scale preceding it. In the tile creation process, base scale 221 (the highest resolution scale) can be completed before lower resolution scales. 
Scales 14 include tiles 18 that are addressable by columns 423. Referring now to FIG. 1B (PRIOR ART), exemplary base scale (level 3) 28 is the highest resolution scale in the exemplary series including level one 24 and level two 26, the lowest resolution level ("} +{"output_text": " layer.\nIn the case of the UBM structure including the nickel layer, the nickel layer is formed by electroplating or electroless plating. The nickel layer is formed by electroplating in the following manner. First, a seed layer is formed on the pad by sputtering. Then, a nickel layer is formed on the seed layer by electroplating. The seed layer is formed by sputtering to prevent the nickel layer from being formed on the pad. The seed layer is formed", "input_text": " process trended toward using environmentally safe materials, such as a Pb-free solder. Many problems are caused in the flip chip process including the finding that the UBM structure cannot effectively prevent the diffusion between the solder and the pad upon use with 63-37% Pb process solder and Pb-free solder, and in particular, Pb-free solder. Many researchers are working on UBM suitable for use with Pb-free solder, using sputtering, electroplating and electroless plating methods.\nMost Pb-free solders developed until now have a large amount of tin. The Pb-free solder materials, suitable for use in the flip chip interconnections, are exemplified by Sn-3.5% Ag, Sn-0.7% Cu, and Sn-3.8% Ag-0.7% Cu. These materials contain 95% or more of Sn. Since the tin element rapidly reacts with copper, tin in the solder reacts with copper of UBM by heat generated in the course of reflow of the flip chip or use of the chip. Thus, an intermetallic compound is formed at the interface of UBM and the solder, and the copper is self-extinguished. If the intermetallic compounds are excessively formed or the copper layer in the UBM is completely self-extinguished, bonding strength between the solder and the pad is drastically decreased. 
Hence, UBM for use with Pb-free solder having high Sn content requires a novel diffusion barrier. In this regard, nickel is used. Nickel is slower in reaction rate with tin than copper. Until now, there have been proposed various processes for the formation of the diffusion barrier made of nickel or nickel alloy through sputtering, electroplating and electroless plating methods. However, the UBM structure including the nickel layer suffers from the problems related with poor solderability of nickel and residual stress in the nickel"} +{"output_text": " expansion between the first and second substrates.\nIn another aspect, the method comprises patterning a plurality of interconnection elements on the surface of a first substrate, each interconnection element having an attachment element coupled to a first substrate and a body comprising a plurality of resilient elements, the attachment element coupled to a first surface of the body, a second surface of the body having a contact region capable of contacting a terminal of a chip-scale device. The first substrate is brought together with a second substrate so that", "input_text": "connection element are improved over single beam spring interconnection elements, particularly in sub-micron pitch spacing range of current and future technologies of contact pads or terminals of an integrated circuit. For example, the multiple-leaf portion body of the interconnection element of the invention can achieve improved mechanical properties such as spring constant, compliance, and lower material stress over single beam spring interconnection elements in fine-pitch applications.\nThe interconnection elements formed by the different aspects of the method of the invention are suitable for making either temporary or permanent electrical connection between contact pads or terminals of an electronic component such as a PCB and a chip under test. In this regard, a method of making electrical connections is disclosed. 
In one aspect, the method comprises patterning a plurality of interconnection elements on the surface of a first substrate, each interconnection element having an attachment element coupled to a first substrate and a body comprising a plurality of resilient elements, the attachment element coupled to a first surface of the body, a second surface of the body having a contact region capable of contacting a terminal of a chip-scale device. The first substrate is brought together with a second substrate so that the contact regions of the interconnection elements are in contact with the second substrate. For making temporary connection, the first substrate is brought together with another substrate, such as an electronic component, where the contact regions of the second substrate are electrical contacts such as terminals. The interconnection elements react resiliently to maintain contact pressure and, in one embodiment, to maintain an electrical connection between the two components. For making permanent connection, the first substrate is brought together with the second substrate and the contact regions of the interconnection elements are joined or bonded, such as by soldering, welding, or brazing or with a conductive adhesive, to, for example, a terminal of the other substrate. In one embodiment, the interconnection elements are compliant and may accommodate differential thermal"} +{"output_text": "s)) to the printing medium. However, when the printing is performed for a long time, the ink amount in the sub tank is gradually reduced. When the ink amount in the sub tank is reduced to a predetermined amount or less, the ink supply to the printing head is stopped.\nIn the above-described ink supply method, the ink amount in the sub tank is reduced to the predetermined amount or less when the printing is performed for a long time. 
However, the ink amount in the sub", "input_text": ", of course, requires some means for admitting the coolant from the pool into the pipes of the circulatory cooling system, either by the use of check valves or remotely controlled valves. However, these introduce yet another variable to the system in that the valves require moving parts and provide no assurance that they will work when needed. Thus, to increase the reliability of the system, redundancy is required, which, in turn, greatly enhances the cost and complexity of the system.\nObviously, there is need for an emergency core cooling system which could remove the decay heat from the reactor and which does not require the use of moving parts within the reactor vessel. 1. Field of the Invention\nThe present invention relates to a printing apparatus, a printing system, and a prediction method of the usage of a printing agent that can be configured to predict, prior to the development processing of a printing image (to-be-printed image), the usage of the printing agent (e.g., ink) required for the printing.\n2. Description of the Related Art\nA printing apparatus requires a printing agent (e.g., ink) in order to provide printing to a printing medium. One ink supply method is, for example, an on-demand supply method (also called as pit in method) for a serial scan type printing apparatus in which a carriage having a reciprocating movement in a main scanning direction includes an ink jet printing head. 
This method is a method in which, only when ink needs to be supplied to a tank included in the carriage (sub tank), ink is supplied from a tank (main tank) in a printing apparatus body to the sub tank by allowing the sub tank to communicate with the main tank.\nGenerally, when such an ink supply method is practically used, the sub tank has a capacity for retaining the ink amount for providing the printing (printing of solid image for one to two page("} +{"output_text": " in the ATP-sensitive potassium channel gene and which are useful for the study of the physiological role of the ATP-sensitive potassium channel and for the development of improved therapeutics for non-insulin dependent diabetes mellitus has been desired.\nThe present invention has been made in view of the above circumstances, and an object of the present invention is to provide a novel ATP-sensitive potassium channel gene, a novel ATP-sensitive potassium channel protein, a novel ATP-sensitive potassium channel gene product, a novel ATP-", "input_text": "a receptor (SUR) has been found by Aguilar-Bryan et al. in the process of cloning Kir6.2 (BIR) [Aguilar-Bryan, L. et al., Science, 268:423-426 (1995)]. Now it is known that the pancreatic.beta.-cell K.sub.ATP channel comprises at least two subunits, a Kir6.2 (BIR) and a sulfonylurea receptor (SUR), and mutation occurred in the K.sub.ATP channel gene is considered to be one of the main causal factors of non-insulin dependent diabetes mellitus. For the diversity and functions of inward-rectifying potassium channels, see Horio, Y., SEIKAGAKU, 70(2), p.73-83 (1998), and Seino, S. et al., DIABETES REVIEWS,4(2),177-190 (1996).\nToday, diabetes mellitus is a disease of a national scale, the number of its patients including potential ones reaching several million in Japan and the United States, and 5-10% of the middle aged or over are estimated to be diabetic patients. 
Diabetes mellitus is classified into type I (IDDM), which is insulin-dependent, and type II (NIDDM), non-insulin dependent, and the non-insulin dependent type II constitutes more than 90% of the number of the diabetic patients. As aforementioned, the relation between non-insulin dependent diabetes mellitus and ATP-sensitive potassium channels is being gradually elucidated. However, development of new therapeutics for NIDDM has been retarded, for such experimental animals have not been obtained that are necessary for enabling studies on the physiological role of the ATP-sensitive potassium channel and for enabling thereupon development of improved therapeutics for non-insulin dependent diabetes mellitus.\nOn the other hand, creation of model animals which have mutation"} +{"output_text": " relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer interconnection structure.\nIn recent years, the degree of integration of semiconductor devices has been increased, and the number of interconnection layers has been increased. In order to increase the number of interconnection layers, it is necessary to reduce the thickness of interconnection layers. However, if the thickness of interconnection layers is reduced, the resistance of the interconnection layers is increased", "input_text": "1, NS2, NS3, NS4, and NS5; and the genes related to virus growth, etc. are primarily included in downstream regions from NS3.\nHCV RNA polymerase is related to the transcription and replication of genomic RNA, and plays an important role in the reproduction of HCV. The gene encoding this polymerase is thought to be included in the above-mentioned NS5 region (Z. H. Yuan et al., Biochemical and Biophysical Research Communications 232, 231-235(1997), S. B. Hwang et al., Virology 227, 439-446(1997), S. E. 
Behrens et al., The EMBO Journal 15 12-22(1996)).\nThe Problem to Be Solved by the Invention\nIf the gene encoding HCV RNA polymerase can be isolated, it will become possible using this gene to easily screen for substances inhibiting RNA polymerase, and contribute greatly to the development of drugs for treating HCV. However, at present, although the nucleotide sequence of a portion of the NS5 region has been clarified (Japanese Patent Application Laid-Open (Kokai) No. 6-225770), the entire nucleotide sequence of the RNA polymerase gene has yet to be clarified.\nThe object of the present invention is to isolate the gene encoding the full length of HCV-derived RNA polymerase, to determine its nucleotide sequence, as well as to establish its expression system.\nA further object of the present invention is to provide a screening method for a substance which inhibits the activity of this gene or this protein employing this gene or this RNA polymerase protein.\nMeans for Solving the Problem\nIn order to solve the above problem, the present inventors, as result of deliberate and focused research have succeeded in isolating the gene encoding the full-length of HCV-derived RNA polymerase, thereby completing the present invention.\nThat is to say, the present invention"} +{"output_text": " the signal pin 16, the positive ESD surge diode 20 clamps the voltage on the signal pin 16 to one diode drop above the voltage rail 10. The negative ESD surge diode 22 clamps the voltage on the signal pin 16 to one diode drop below the ground rail 12. For negative ESD surges on the signal pin 16, the positive ESD surge diode 20 clamps the voltage on the signal pin 16 to one diode drop below the voltage rail 10. The negative ESD surge", "input_text": " a conventional ESD protection circuit in this regard. As illustrated in FIG. 1, a voltage rail (Vdd) 10 and a ground rail (GND) 12 are provided to power a protected circuit 14. 
The protected circuit 14 can be any type of circuit and provided in any form desired. In this example, a terminal in the form of a signal pin 16 provides a signal path to the protected circuit 14 for providing information and/or control to the protected circuit 14. For example, the protected circuit 14 may be included in an IC, with the signal pin 16 being an externally available pin on the IC chip.\nA conventional ESD protection circuit 18 may be coupled between the voltage rail 10 and ground rail 12 to protect the protected circuit 14 from ESD surges. The exemplary ESD protection circuit 18 in FIG. 1 includes two conventional diodes: a positive ESD surge diode 20 and a negative ESD surge diode 22. The positive ESD surge diode 20 and the negative ESD surge diode 22 are coupled in series. The positive ESD surge diode 20 clamps positive voltage on the signal pin 16 to one diode drop above the voltage rail 10. The negative ESD surge diode 22 clamps negative voltage on the signal pin 16 to one diode drop below the ground rail 12. A cathode (k) of the positive ESD surge diode 20 is coupled to the voltage rail 10. An anode (a) of the positive ESD diode 20 is coupled to the signal pin 16 at a node 24 on the signal path between the signal pin 16 and the protected circuit 14. A cathode (k) of the negative ESD surge diode 22 is also coupled to the node 24 on the signal path from the signal pin 16 to the protected circuit 14. An anode (a) of the negative ESD surge diode 22 is coupled to the ground rail 12.\nFor positive ESD surges on"} +{"output_text": " from the surgical site. The smoke is generated by the electrical discharge and is carried by the airflow to the surgical site. The smoke is a source of contamination and is a source of discomfort to the patient. 
It is therefore desirable to provide a surgical knife which is capable of removing smoke from the surgical site.\nAnother disadvantage of prior art electrosurgical knives is the inability to provide a light source for illuminating the surgical site. The light source is typically a fiber-optic cable which is connected to", "input_text": " 3,825,004 referenced above, and is further provided with an aperture in the knife handle communicating with an opening in the vacuum tube. By selectively covering the aperture in the handle with a finger, the surgeon controls the amount of suction applied to the surgically treated area.\nU.S. Pat. No. 4,562,838 to W. S. Walker (dated Jan. 7, 1986) discloses an electrosurgical knife having a generally cylindrical housing and an electrode extending from a central opening in the housing at the distal end thereof. The housing is provided with a number of ducts at the distal end thereof in communication with a cavity internal to the housing. A hose, which may be either connected to a sterile air-pressure source or a vacuum source, is connected to the fluid cavity within the housing and may be used to aspirate smoke or to distribute an airflow in the area of the surgical blade. The housing is further provided with a mounting channel along its upper edge for slidably receiving a light-transmitting cable of a fiber-optic system to illuminate the region around the cutting blade.\nDisadvantages of these and other prior art devices are that the electrosurgical knives have become relatively bulky and complex structures which are not inexpensive to manufacture. The prior art devices are often difficult to hold and lack the flexibility that is desired by many surgeons. 
Surgeons in a number of hospitals use the less expensive standard electrosurgical knives which do not have the aspirating capability or a separate light source to perform operations in well-ventilated and well-lighted areas, and use the more expensive, specialized knives only when required. This means that the hospitals must have multiple inventories. It is therefore desirable to provide a surgical knife which is inexpensive and optionally provides the capabilities of the more specialized knives.\nA particular disadvantage of prior art electrosurgical knives is the inability to withdraw smoke"} +{"output_text": "orientation or the inability to stretch the film beyond a certain point.\nU.S. Pat. No. 4,677,017 discloses a process for producing a stretched film of a fluoropolymer such as PCTFE. The process includes the steps of extruding a fluoropolymer melt through a die to form a film, stretching the film in the machine direction and then in the transverse direction, and then heat treating the film to increase the degree of orientation.\nU.S. Pat.", "input_text": " static IP configuration discussed above in reference to IPv4 reproduce themselves with DHCPv6 and static IP configuration in an IPv6 network. 1. Field of the Invention\nThe present invention relates to oriented multilayer films. More particularly, the invention pertains to coextruded or laminated films having at least one layer of a fluoropolymer such as poly(chlorotrifluoro ethylene) (PCTFE) homopolymer or copolymer, a layer of a polyolefin homopolymer or polyolefin containing copolymer and an intermediate adhesive layer of a polyolefin having at least one functional moiety of an unsaturated carboxylic acid and/or anhydride thereof.\n2. Description of the Prior Art\nIt is well known in the art to produce oriented polymeric films. See, e.g. U.S. Pat. No. 4,011,874. 
However, such films tend to expand in the direction perpendicular to the direction of stretching.\nIt is also known in the art to produce single layer and multilayer fluoropolymer films. See, e.g. U.S. Pat. Nos. 4,677,017; 4,659,625 and 5,139,878, all of which are incorporated herein by reference. However, fluoropolymers are difficult to orient due to their unique crystallization properties. More particularly, PCTFE is exceptionally difficult to orient due to its extremely fast crystallization rate and thermally induced self-orientation. Its fast crystallization rate produces a highly crystalline structure that hinders orientation and actually prevents further orientation beyond a certain point. Its thermally induced self-orientation results in a film which, upon unconstrained heating, self extends in the machine or longitudinally stretched direction and shrinks in the transverse direction.\nMost earlier attempts to stretch PCTFE films have failed either due to its high degree of film crystallinity, nonuniform crystallinity, self-"} +{"output_text": " the second loop.\nIn another embodiment, the first object fastening receptacle is a cleat having two legs, a cross piece joining the two legs and, with the first object, forming a cleat opening between the legs, and further wherein the first loop has a distal end and a length such that the first loop can be placed to loop the two cleat legs and the first loop distal end can then be drawn through the cleat opening as the first loop distal end is routed back to", "input_text": " the elongated second loop can be routed to encompass the second object fastening receptacle and then routed back to the distal loop, the apparatus further comprising a second fastening member for securing the distal loop to the second strap portion second loop when the second loop is routed back to the distal loop from the second object fastening receptacle.\nIn another embodiment, the second fastening member has a first closing hook for hooking 
and closing about the second strap portion distal loop, and a second closing hook for hooking and closing about the second strap portion second loop.\nIn another embodiment, the first object fastening receptacle is a cleat having two legs, a cross piece joining the two legs and, with the first object, forming a cleat opening between the legs, and further wherein the second loop has a distal end and a length such that the second loop can be placed to loop the two cleat legs and the second loop distal end can then be drawn through the cleat opening as the second loop distal end is routed back to the first loop, wherein it is fastenable to the first loop using the fastening member.\nIn another embodiment, the second object fastening receptacle is a cleat having two legs, a cross piece joining the two legs and, with the second object, forming a cleat opening between the legs, and further wherein the second strap portion second loop has a distal end and a length such that the second loop can be placed to loop the two cleat legs and the second loop distal end can then be drawn through the cleat opening as the second loop distal end is routed back to the second strap portion distal loop, wherein it is fastenable to the distal loop using the second fastening member.\nIn another embodiment, the fastening member has a first closing hook for hooking and closing about the first loop, and a second closing hook for hooking and closing about"} +{"output_text": " when the gaps between the Si wafers 8 are increased to 10 mm, the sheet resistances of the SiGe epitaxial growth films formed on the Si wafers 8 are reduced to 1498 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2200 \u03a9/\u25a1, 2", "input_text": "zzles renders the vertical type low pressure CVD apparatus rather costly.\nFurther, when gaps between the Si wafers 8 mounted in the standard boat 5 are small, it is difficult for 
sufficient amounts of B2H6 to diffuse to the central portions of the respective Si wafers 8. This causes non-uniformity in the concentration of B even within a same Si wafer, exhibiting a concentration gradient of B in the Si or the SiGe epitaxial growth film lowering from the peripheral portion towards the central portion thereof.\nFor instance, when SiGe epitaxial growth films were grown on the Si wafers 8 by loading into the reaction chamber 40 the Si wafers 8 of a diameter of 200 mm mounted in the standard boat 5 with a gap of 5 mm therebetween and then introducing thereinto B2H6 as the doping gas, the sheet resistances, at the location of A to I shown in FIG. 3, of a SiGe epitaxial growth film formed on a Si wafer 8 were 1498 \u03a9/\u25a1, 2640 \u03a9/\u25a1, 2800 \u03a9/\u25a1, 2510 \u03a9/\u25a1, 1463 \u03a9/\u25a1, 1633 \u03a9/\u25a1, 2600 \u03a9/\u25a1, 2650 \u03a9/\u25a1, 2070 \u03a9/\u25a1 respectively, showing the variation of sheet resistances of \u00b130.3%. In general, when B is doped into the SiGe epitaxial growth film, the sheet resistances thereof become greater as the concentration of B in the SiGe epitaxial growth film becomes lower. Therefore, from the data described above, it is evident that the concentration of B is not uniform at the surface of a SiGe epitaxial film.\nIn order to solve the foregoing problems, it may be necessary to increase the gaps between the Si wafers 8 mounted in the standard boat 5. For example,"} +{"output_text": " and are relatively insensitive to temperature and process variations. The band-gap reference circuit is typically implemented as a voltage-to-current converter that is configured to generate a current that is proportional to the voltage across the band-gap reference circuit. The current is then used to bias a current mirror that is configured to generate a reference current that is proportional to the reference current. 
The reference current is then used to bias a voltage-to-current converter that is configured to generate a reference voltage that is proportional", "input_text": " efficient method of depressurizing, perhaps also accommodating for multiple sources of different base components, has been found to be desirable.\nSimilarly, different beverages are formed from concentrates that are only slightly different from each other. For example, customers are increasing interested in enjoying beverages that include a supplemental flavor in addition to a base flavor. One popular supplemental flavor is cherry. For example, some consumers enjoy cola-flavored beverages with cherry flavoring and others lemon lime-flavored beverages with cherry flavoring. In presently known dispensing units, in order to provide customers with different beverages, and the supplemental-flavored versions of these beverages, it is necessary to provide a dispensing head for each of these beverages. As discussed above, this results in providing a counter-top assembly that is very large. Moreover, this would also require a large volume of behind-the-counter space in order to store the different types of concentrate that are required. The present invention relates generally to reference voltage circuits and, more particularly, to a band-gap reference circuit with operational amplifier offset cancellation.\nThe rapid proliferation of local area network (LANs) in the corporate environment and the increased demand for time-sensitive delivery of messages and data between users has spurred development of high-speed (gigabit) Ethernet LANs. 
The 100BASE-TX Ethernet LANs using category-5 (CAT-5) copper wire and the 1000BASE-T Ethernet LANs capable of one gigabit per second (1 Gbps) data rates over CAT-5 data grade wire use new techniques for the transfer of high-speed data symbols.\nConventional 1000BASE-T Ethernet LAN drivers, in addition to nearly all other signal processing/communication chips and systems, use band-gap reference circuits. These band-gap reference circuits are able to generate relatively constant reference voltages that have a well-defined magnitude"} +{"output_text": " of the multiplication operation.\nThe execution unit 201 includes a multiplier 207 and an adder 208. The multiplier 207 receives the multiplicand A and the multiplier B, and performs a multiplication operation on the multiplicand A and the multiplier B. The adder 208 receives the multiplicand A and the multiplier B, and adds the multiplicand A to the multiplier B to produce the result of the multiplication operation.\nThe execution unit 201 also includes a register file 209. The register file 209 stores the multiplic", "input_text": " its data transmission with the base station. Therefore, a decreasing of data transmission speed and a degrading of quality of service can be caused due to the termination of the data transmission in the related art. 1. Field of the Invention\nThe present invention relates to the field of computer arithmetic, and in particular to a method and apparatus for efficient binary multiplication.\n2. Art Background\nA number of techniques and algorithms exist for performing multiplication in software and computer hardware. With the growth of applications requiring fast multiplication, computer designers have found it necessary to turn to hardware solutions to implement multiplication. However, fast multiplication is a very hardware intensive operation, requiring a large number of devices that occupy a sizable amount of real estate in an integrated circuit. 
Thus, it is a goal of the computer designer to achieve fast multiplication using the smallest number of devices necessary.\nMany components of a computer system would benefit from smaller fast multipliers. FIG. 1 is a standard block diagram of a computer system including a CPU 101 and an arithmetic coprocessor 102. A main system bus 103 links processors 101 and 102 to each other and to a main memory 104 and an I/O processor 105. The I/O processor 105 links the processing units 101 and 102 and the main memory 104 through an I/O expansion bus 106 to various I/O devices, including a secondary memory 107 through a memory controller 108, a printer 109 and a keyboard 110 through I/O controller 111, and to a monitor 112 through a graphics controller 113.\nFIG. 2 illustrates CPU 101 in greater detail. An execution unit 201 executes arithmetic operations according to instructions fetched by an instruction sequencer 202 from an instruction cache 203. In a multiplication operation in the CPU, the operands, multiplicand A and multiplier B, are provided to the execution unit 201 by register file 204 over operand buses 205. A result bus 206 carries the result"} +{"output_text": "IBER cage, that are designed to be smaller than the disc space. The smaller size of these cages allows the surgeon to insert the cage into the disc space and position it more precisely. However, the smaller size of these cages can make it difficult to insert the cage into the disc space, and the smaller size of the cage can make it difficult to position the cage in the disc space.\nThe present invention is directed to a system and method for providing a surgical instrument that can be used to", "input_text": " herniation, or perhaps to correct a prior surgery.\nBone graft material is introduced for fusion and a fusion cage can be inserted to help support the disc space during the fusion process. 
In fact, fusion cages are frequently used in such procedures to support and stabilize the disc space until bone graft unites the bone of the opposing vertebral endplates in the disc space. A transforaminal lumbar interbody fusion (TLIF), for example, involves placement of posterior instrumentation (screws and rods) into the spine, and the fusion cage loaded with bone graft can be inserted into the disc space. Bone graft material can be pre-packed in the disc space or packed after the cage is inserted. TLIF can be used to facilitate stability in the front and back parts of the lumbar spine promoting interbody fusion in the anterior portion of the spine. Fusion in this region can be beneficial, because the anterior interbody space includes an increased area for bone to heal, as well as to handle increased forces that are distributed through this area.\nUnfortunately, therein lies a problem solved by the teachings provided herein. Currently available systems can be problematic in that the methods of introducing the fusion cage and bone graft material leaves pockets in regions of the intervertebral space that are not filled with bone graft material, regions in which fusion is desired for structural support. These pockets can create a premature failure of the fused intervertebral space due to forces that are distributed through the regions containing the pockets, for example, when the patient stands and walks.\nTraditional fusion cages, such as the Medtronic CAPSTONE cage, are designed to be oversized relative to the disc space to distract the disc space as the entire cage is inserted. However, this makes it difficult to insert and position properly. In response to the problem, the art has developed a number of new fusion cages, such as the Globus CAL"} +{"output_text": ". 
For example, the World Wide Web Consortium has recently defined a next generation markup language, called \u201cXML\u201d (eXtensible Markup Language), which is intended to be a universal language for structuring data and documents. XML is intended to be a more powerful and flexible replacement for HTML.\nThe Internet is a global network of cooperatively interconnected computer networks that use the standard Internet protocols. The Internet has become a cultural fixture as a source of both information and entertainment. Many businesses are creating", "input_text": " volatiles includes an airless grinder which is operated under pressure so that air leakage is minimized, enabling the use of low cost equipment.\nIn yet another aspect, the process includes both airless grinding and steam heating operations, which are performed substantially simultaneously under pressure. This simplifies the construction of a system by eliminating some of the components required in the prior art setups. The Internet is a well-known, global network of cooperatively interconnected computer networks. The World Wide Web portion of the Internet is a collection of server computers (referred to as \u201cweb sites\u201d) on the Internet, which store hyper text transfer language \u201cHTML\u201d documents that can be publicly accessed by computer users having a connection to the Internet.\nMost basically, the Internet comprises a network of computer networks capable of transmitting messages to one another using a common set of operating rules, called communication protocols. Networks comprise addressable devices (computers) connected or \u201clinked\u201d by communication channels. 
More specifically, the World Wide Web\u2014comprises an amalgamation of linked-together \u201cweb pages\u201d accessible by linked web-based users, with the web pages typically presenting information to the user in a graphical fashion.\nWorld Wide Web (\u201cweb\u201d) is used herein to refer generally to both (i) a distributed collection of interlinked, user-viewable Hypertext documents (commonly referred to as web documents or web pages) that are accessible via the Internet, and (ii) the client and server software components that provide user access to such documents using standardized Internet protocols. Currently, the primary standard protocol for allowing applications to locate and acquire web documents is hypertext transfer protocol (HTTP), and the web pages are encoded using HTML.\nHowever, the terms \u201cweb\u201d and \u201cWorld Wide Web\u201d are intended to encompass future markup languages and transport protocols that may be used in place of (or in addition to) HTML and HTTP"} +{"output_text": " position and the GPS receiver's position is then used to correct the GPS receiver's position determination.\nThe DGPS technique is well known in the art and is described in detail in U.S. Pat. No. 5,828,336 issued to Yunck, et al. which is incorporated herein by reference in its entirety.\nThe DGPS technique is also well known in the art and is described in detail in U.S. Pat. No. 5,828,", "input_text": " speed of electromagnetic radiation (i.e., the L-band radio signal). While the propagation speed of electromagnetic radiation is constant in a vacuum, it is retarded by passage through matter such as air in the atmosphere. 
The amount of speed alteration (i.e., delay) caused by the atmosphere will depend on the thickness of the air layer traversed, temperature, and a variety of other atmospheric conditions.\nApart from the \u201cnatural\u201d category of errors in pseudorange determination and in determination of precise satellite positions, GPS also contains the capability to produce purposeful errors\u2014known as selective availability (\u201cSA\u201d)\u2014which can be introduced by the U.S. military. That is, in order to prevent the precision of GPS positioning from being used by the wrong persons, the military has the capability to introduce purposeful random errors into the clock signal broadcast by the GPS satellites. This has the effect of further degrading the accuracy of the pseudorange determinations and, hence, the accuracy of the coordinates determined for the GPS receiver.\nA more detailed discussion of both the so-called \u201cnatural\u201d and \u201cmilitary\u201d categories of errors affecting the accuracy of GPS receivers can be found in U.S. Pat. No. 5,828,336 issued to Yunck, et al. which is incorporated herein by reference in its entirety.\nA known method of improving the accuracy of a (standalone) GPS receiver's position determinations in spite of the above-mentioned category of errors is known as Differential GPS (DGPS). In this technique, one or more additional known locations are added to the GPS determination. Essentially, one or more ground stations in the general vicinity of a moving GPS receiver simultaneously receive the GPS signals and determine their own positions. Because the ground stations are stationary, any change in their determined position must be due to GPS error, either natural or military. The delta value between the ground station's"} +{"output_text": " outwardly through the motor, and exits axially across a low resistance grille. The air flow is forced through the motor by the centrifugal force of the rotating blades. 
The air flow is forced through the motor by the centrifugal force of the rotating blades. The air flow is forced through the motor by the centrifugal force of the rotating blades. The air flow is forced through the motor by the centrifugal force of the rotating blades. The air flow is forced through the motor by the centrifug", "input_text": " at higher pressures, U.S. Pat. No. 3,526,386 discloses a plastic valve having a metallic sleeve which is inserted in and embedded within the flow line thereof. The sleeve is said to \"reinforce\" the flow line but is primarily utilized to provide a stronger, reinforced coupling means for installing the plastic valve within a piping system.\nU.S. Pat. No. 4,171,711 also discloses a bottom cap member to permit leakage if the fluid pressure within the valve body increases above a given amount. U.S. Pat. No. 4,171,711 also discloses valve seats positioned with the cylindrical plug with key portions as guides for the seal.\nU.S. Pat. No. 4,511,120 discloses a rotary plug type plastic service valve that has a unitary plug and actuator member where the actuator portion of the valve has a flange portion so that impact forces applied to the actuator are transmitted away from the stem and plug portions and distributed throughout the valve body.\nConsequently, there remains a need for providing a plastic valve which includes means for insuring that the valve can be satisfactorily employed in fluid systems of higher pressure without any detriment thereto over an extended period of time. This invention relates to a dynamoelectric machine having a quiet cooling system. A typical example of the state of the art in cooling systems for dynamoelectric machines prior to this invention is the cooling system for a totally enclosed fan cooled motor. 
That system comprises a motor having cooling fins axially disposed about its periphery, a rotor shaft extending beyond the body of the motor, blades disposed on a hub which is clamped to the extended portion of the rotor shaft, a casing which totally encloses the motor, and a protective grille attached to the motor casing. In these motors the air flow enters axially across a high resistance grille, abruptly turns ninety degrees, is forced radially"} +{"output_text": ". In an embodiment, the additive is a polyamine, such as spermidine or spermine. In an embodiment, the additive is a polyol, such as glycerol or ethylene glycol. In an embodiment, the additive is a polyamine, such as spermidine or spermine, and a polyol, such as glycerol or ethylene glycol. In an embodiment, the additive is a polyamine, such as spermidine or spermine, and a polyol, such as glycerol or", "input_text": "-neoplastic gene therapy and other gene therapy formats.\nThe invention provides a method for generating an enhanced green fluorescent protein (GFP) and polynucleotides encoding same, comprising performing DNA shuffling on a GFP encoding expression vector and selecting or screening for variants having an enhanced desired property, such as enhanced fluorescence. In a variation, an embodiment comprises a step of error-prone or mutagenic amplification, propagation in a mutator strain (e.g., a host cell having a hypermutational phenotype; mutL, etc.; yeast strains such as those described in Klein (1995) Proqr. Nucl. Acid Res. Mol. Biol. 51: 217, incorporated herein by reference), chemical mutagenesis, or site-directed mutagenesis. In an embodiment, the enhanced GFP protein comprises a point mutation outside the chromophore region (amino acids 64-69), preferably in the region from amino acid 100 to amino acid 173, with specific preferred embodiments at residue 100, 154, and 164; typically, the mutation is a substitution mutation, such as F100S, M154T or V164A. 
In an embodiment, the mutation substitutes a hydrophilic residue for a hydrophobic residue. In an embodiment, multiple mutations are present in the enhanced GFP protein and its encoding polynucleotide. The invention also provides the use of such an enhanced GFP protein, such as for a diagnostic reporter for assays and high throughput screening assays and the like.\nThe invention also provides for improved embodiments for performing in vitro sequence shuffling. In one aspect, the improved shuffling method includes the addition of at least one additive which enhances the rate or extent of reannealing or recombination of related-sequence polynucleotides. In an embodiment, the additive is polyethylene glycol (PEG), typically added to a shuffling reaction to a final concentration of 0.1 to percent, often to a final concentration of 2.5 to 15 percent"} +{"output_text": " course, the magnetic rod is not a homopolar generator.\nThe second type of homopolar generator is the radial field type, such as the motor described above, wherein the magnetic field is radially oriented and the electric field is axially oriented. For a radial field homopolar generator or motor, the third type, the magnetic field is oriented axially and the electric potential is radially oriented. An example of a simple axial field type homopolar motor is disclosed in \"The axial magnetic", "input_text": "ating homopolar generators is explained in greater detail in two articles which are incorporated herein by reference: \"One-piece Faraday generator: A paradoxical experiment from 1851\", Crooks et al., American Journal of Physics, Vol. 46, No. 7, p. 729-731, July 1978; and \"Electromagnetic Induction in Moving Systems\", Corson, American Journal of Physics, Vol. 24, p. 126, 1956. As explained in Crooks et al., the co-rotating generator is not novel and is often classified as a paradox to Faraday's law. 
However, it is the simplification of the electromagnetic theory to the concept of \"flux-cutting\" that creates difficulty in comprehending the co-rotating homopolar generator.\nIn general, there are two types of homopolar generators and motors. The first type is the axial field type, such as the generator described above, wherein the magnetic field is axially oriented and the electric field is radially oriented. For a radial field homopolar generator or motor, the second type, the magnetic field is oriented radially and the electric potential is axially oriented. An example of a simple radial type homopolar motor is disclosed in \"The radial magnetic field homopolar motor\", Eagleton et al., American Journal of Physics, Vol. 56, No. 9; p. 858-859, September 1988. This radial motor is comprised of a stainless steel tube having a contrapolarized magnetic rod therein. The steel tube and the magnetic rod are supported by separate bearings so that they are able to rotate with respect to each other along the longitudinal axis of the apparatus. Two electrical contacts, which are spaced apart from each other, are operatively connected to the steel tube. By providing electrical current to the tube, the tube rotates, but the magnetic rod does not rotate. Of"} +{"output_text": " be divided into two main categories: (i) those that use lithographic techniques to define the channels and (ii) those that use etching techniques to define the channels. The former category includes for example, photolithography, wet chemical etching and dry chemical etching. The latter category includes for example, reactive ion etching (RIE), plasma etching and deep reactive ion etching (DRIE).\nThe fabrication of microfluidic devices using lithographic techniques is well established and is described for example in", "input_text": " than 1 mm, whilst nanofluidic devices will have generally smaller channels. 
With devices measured at the micrometer level and fluids measured in nanoliters and picoliters, microfluidics devices are widely used for example in biotechnology or biochemistry.\nThese devices can be used to handle a wide variety of liquids sample types. However, they are particularly useful in biochemical research or diagnostics in particular clinical diagnostics, where they may be used to handle liquids such as blood samples (including whole blood or fractions such as blood plasma), bacterial cell suspensions, protein or antibody solutions and other reagents including organic solvents, buffers and salts. Depending upon the nature and arrangement of the microfluidic device, it can be used in a wide range of analytical techniques and methods including for example, the measurement of molecular diffusion coefficients, fluid viscosity, pH, chemical binding coefficients and enzyme reaction kinetics. Other applications for microfluidic devices include capillary electrophoresis, isoelectric focusing, immunoassays, flow cytometry, sample injection of proteins for analysis via mass spectrometry, amplification of nucleic acids for example using amplification reactions such as the polymerase chain reaction (PCR), DNA and protein analysis, cell manipulation, cell separation, cell patterning and chemical gradient formation, high through-put screening, micro chemical manufacture, cell based testing of drug candidates, patient monitoring, proteomics and genomics, chemical microreactions, protein crystallisation, drug delivery, scale-up to manufacturing of drugs, security and defense.\nThe use of microfluidic devices in carrying out biomedical research and analysis has a number of significant advantages. First, because the volume of fluids within these channels is very small, usually several nanoliters, the amount of reagents and analytes used is quite small. 
This is especially significant for expensive reagents or where reagents are scarce, for example in some diagnostic applications or forensic DNA analysis.\nThe fabrications techniques used to construct microfluidic devices can"} +{"output_text": " a digest of the major subtopics. The invention also incorporates methods of identifying and summarizing the major subtopics. The invention also incorporates methods of identifying and summarizing the major subtopics, and of identifying and summarizing the major subtopics, and of identifying and summarizing the major subtopics, and of identifying and summarizing the major subtopics, and of identifying and summarizing the major subtopics, and of identifying and summarizing the major subtopics", "input_text": " method processes the thread tree bottom-up and, at each step, combines a parent with currently open child subtrees, separately or together, if the similarity between the parent word vector and the centroid vector of the child subtree or subtrees exceeds an (unspecified) absolute input threshold. The word vectors used to represent the vectors handle quoted passages by reducing the weights of quoted words, in order to keep inter-message distances from being too small. While no results are given, if the threshold is set relatively high, this method would probably lead to shallow subtrees, suitable as query results. However, it is unlikely that the method would lead to clustering results suitable for subtopic identification or digesting. Based on our experiments, quoted words require more detailed treatment, and some trials of a similar single-link clustering method using distances between a node and the centroid of an adjacent cluster produced unsatisfactory results.\nAnother approach related to discussion tree segmentation is described in a paper by H. Ozaku, K. Uchimoto, M. Murata, and H. 
Isahara entitled \u201cTopic Search for Intelligent Network News Reader HISHO\u201d, in the Proceedings of the 2000 ACM Symposium on Applied Computing. This paper describes a method for retrieving many discussions relating to a query topic, and then attempting to filter out discussion subtrees irrelevant to the topic. The method uses, for the most part, noun keywords to represent messages, and tries to find \u201ctopic changing articles\u201d where the proportion of never-seen-keywords shifts, and \u201ctopic branching articles\u201d where a message gives rise to several responses distinguished by their keyword usage and their referenced quotes. This strategy is reported as of limited success in finding topic-changing articles (recall=57%) and larger success in finding topic branching articles.\nThe present invention incorporates methods of dividing a tree-structured discussion into major subtopics, and of developing"} +{"output_text": " active agents.\nThe invention also relates to the use of these novel collagenic peptides as biomaterials for the manufacture of medical, surgical or cosmetic products, such as artificial tissues or organs, artificial skin, bone, ligament, cardiovascular, intraocular, intraperitoneal, etc. prostheses or implants, or alternatively bioencapsulation systems (implants, microspheres or microcapsules) allowing the sustained and controlled release of active agents.\nThe invention also relates to the use of these novel collagen", "input_text": " semaphores from multiple operating system personalities and in an efficient manner that is transparent to those personalities.\nThe present invention is directed to providing a generic semaphore operation that is able to emulate and respond to semaphore API's from multiple operating system personalities. Semaphore operations of the present invention allow concurrent resource control and process synchronization from multiple operating system personalities. 
This enables a single resource to be used by applications using different operating system personalities.\nIt is therefore an object of the present invention to provide a single set of semaphore operations that emulate and support multiple concurrent operating systems.\nIt is yet another object of the present invention to provide efficient semaphore operations that do not require significant overhead to process.\nThe foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of a preferred embodiment of the invention, as illustrated in the accompanying drawing wherein like reference numbers represent like parts of the invention. The present invention relates to novel collagenic peptides chemically modified by grafting free or substituted thiol functions, borne by mercaptoamino residues. When the collagenic peptides comprise thiol functions, they have the property of being crosslinkable by oxidation and give a collagen derivative crosslinked with disulfide bridges.\nThe invention is also directed toward a process for preparing these novel collagen derivatives which are in crosslinkable form, in the form of a crosslinkable precursor of a derivative or in crosslinked form.\nThe invention also relates to the uses of these novel collagenic peptides as biomaterials that are useful as starting materials for the manufacture of medical, surgical or cosmetic products, such as artificial tissues or organs, artificial skin, bone, ligament, cardiovascular, intraocular, intraperitoneal, etc. 
prostheses or implants, or alternatively bioencapsulation systems (implants, microspheres or microcapsules) allowing the sustained and controlled release of"} +{"output_text": " of the probe.\nThe scattered light is detected by a photodetector, and the AFM measurement is carried out by detecting a shift of the probe in accordance with a force acting on the probe.\nThe SNOM is a means for measuring concaves and convexes of a sample by detecting a shift of the probe in accordance with a force acting on the probe.\nThe SNOM is a means for measuring concaves and convexes of a sample by detecting a shift of the probe in", "input_text": "\nMeanwhile, the AFM has been most widely spread as an apparatus for obtaining topography information of the sample surface among SPMs.\nThe AFM detects a displacement of a cantilever which shifts in accordance with a force acting on a probe when the prove supported on the top end of the cantilever is set near a sample surface, for example, by an optical displacement sensor, thereby to obtain indirectly information concerning concaves and convexes of the sample surface.\nOne of the AFMS is disclosed in the Japanese Patent Application KOKAI Publication No. 62-130302.\nThe technique of measuring concaves and convexes on a sample by detecting a correlative force between the sample and the top end of the probe is utilized for other SPM apparatuses and is used as a means for carrying out so-called regulation.\nN. F. van Hulst et al. has proposed a new SNOM which uses an AFM cantilever made of silicon nitride and detects optical information of a sample while measuring concaves and convexes of the sample by AFM measurement, in \u201cAppl. Phs. Lett. 62(5)\u201d, P. 
461 (1993).\nIn this apparatus, the sample is set on an internal total reflection prism and the sample is illuminated with a He-Ne laser beam from the total reflection prism side, so the sample is excited and an evanescent optical field is formed near the sample surface.\nSubsequently, a probe supported on the top end of the cantilever is inserted in the evanescent optical field, and evanescent light as a localized wave is converted into scattered light as a propagation wave. A part of this light is propagated inside a silicon-nitride-made probe which is substantially transparent with respect to the He-Ne laser beam and passes to the back side"} +{"output_text": "-ferrous Smelting and Energy Saving, published by the Non-ferrous Smelting Association of Japan).\nHowever, the above-mentioned conventional methods have the following problems.\n(1) The reduction of magnetite is not sufficient.\n(2) The reduction of magnetite is not uniform.\n(3) The reduction of magnetite is not sufficient in the case of ultra-fine coke.\n(4) The reduction of magnetite is not sufficient in", "input_text": " smelting furnace so as to decrease the copper loss in the tapped slag and also to minimize fuel consumption (Japanese Unexamined Patent Publication No. 58-221,241). According to descriptions of this publication, since the metallurgical reactions suddenly occur in the oxidizing atmosphere of the flash-smelting furnace, a large amount of Fe.sub.3 O.sub.4, which is a peroxide of iron, is formed and contained in the slag. The unburnt powder coke, which covers the slag, is therefore, caused to react with the magnetite and reduces it. The copper loss in the slag is decreased along with reduction of magnetite.\nIn addition, according to Japanese Unexamined Patent Publication No. 
58-221,241 mentioned above, there are descriptions about the following preferred methods: the powder coke is added in the reaction shaft of a flash smelting furnace in such a manner that the entire surface of melt in the settler is uniformly covered with the unburnt powder coke; regarding the grain size of coke, since the degree of reduction of magnetite decreases when the grain size is ultra-fine, grain size is preferably from 16 mesh (1 mm) to 325 mesh (44.mu.m); and the carbonaceous material should have a high content of volatile matters.\nSaganoseki Smelter, which belongs to the present Assignee, used, in a flash-smelting furnace, powder coke having the following distribution of grain sizes and attained from 2 to 4% of magnetite level in the slag. Also, consideration was given to the fact that the unburnt coke, which floats on the slag surface, reduces a portion of the magnetite (\"Non-ferrous Smelting and Energy Saving\" (1985) edited by Research Committee Concerning Non"} +{"output_text": " longitudinal control arm is constructed as a control arm branch, the second control arm part of the longitudinal control arm can be constructed as a control arm branch, whereby the second control arm part can be constructed as a control arm branch in a manner that is similar to the first control arm part, but with a different geometrical design.\nIn order to realize a particularly simple and cost-effective design of the longitudinal control arm according to the invention, it may be helpful that the longitudinal control arm is constructed as a", "input_text": " by a longitudinal control arm according to the invention and its mobility in the vertical direction is limited only to an extent that this longitudinal control arm at least proportionately takes over the function of a main spring clamped in between the wheel carrier and the vehicle body. 
Therefore, in addition to the longitudinal control arm according to the invention, an additional spring element may also be clamped in at a vehicle wheel suspension according to the invention between the vehicle body and (finally) the wheel carrier, which then, however, can have significantly smaller dimensions than in the case of the conventional state of the art (of a control blade design axle). A longitudinal control arm according to the invention will, however, preferably completely take over the function of an otherwise customary main spring, so that, according to the state of the art, only a compression stop spring and/or rebound stop spring is still provided, which is integrated in a vibration damper functionally connected parallel to the main spring. This compression stop spring and/or rebound stop spring acts or can act between the wheel carrier and the vehicle body.\nIn order to realize different spring rates of the main spring as well as for the adaptation to different vehicle weights in the design position, it may be helpful that the longitudinal control arm according to the invention is constructed with several control arm branches, particularly when the geometrical and material-related scope of design options is not sufficient using a single control arm or control arm branch, for meeting the required application range of the axle. The above-mentioned longitudinal control arm can therefore be formed by two or more, in a wide area, individual control arm parts, which are preferably situated essentially in a common vertical plane, whereby, on the one hand, a considerable degree of freedom exists with respect to the geometrical design of this longitudinal control arm and therefore also its suspension characteristics and other features. 
On the other hand, particularly when the first control arm part of the"} +{"output_text": " constitute a front-end coil end group.\nIn the conventional stator 50 constructed in this manner, the rear-end coil end group and the front-end coil end group are arranged in a radial direction to constitute a stator winding.\nIn the conventional stator 50 constructed in this manner, the conductor segments 54 are inserted into the slots 51a of the stator core 51 in a radial direction.\nIn the conventional stator 50 constructed in this manner, the conductor segments 54 are inserted into the slots 51", "input_text": " at the front end from the fourth positions from the outer circumferential side within the second slots 51a six slots away in a clockwise direction from the first slots 51a are joined to form an inner layer winding having two turns.\nIn addition, the inner layer winding and outer layer winding constituted by the conductor segments 54 inserted into the pairs of slots 51a six slots apart are connected in series to form one phase of the stator winding 52 having four turns.\nA total of six phases of the stator winding 52 each having four turns are formed in this manner. Then, two sets of three-phase stator winding portions are constructed by connecting three phases each of the stator winding 52 into alternating current connections.\nIn the conventional stator 50 constructed in this manner, at the rear end of the stator core 51, turn portions 54c of the pairs of conductor segments 54 inserted into the same pairs of slots 15a are lined up in rows in a radial direction. 
As a result, the turn portions 54c are arranged in two rows circumferentially to constitute a rear-end coil end group.\nAt the front end of the stator core 51, on the other hand, joint portions formed by joining the end portions 54b of the conductor segments 54 extending outwards at the front end from the first positions from the outer circumferential side within the first slots 51a and the end portions 54b of the conductor segments 54 extending outwards at the front end from the second positions from the outer circumferential side within the second slots 51a six slots away, and joint portions formed by joining the end portions 54b of the conductor segments 54 extending outwards at the front end from the third positions from the outer circumferential side within the first slots 51a and the end portions 54b of the conductor segments 54 extending outwards at the front end from the fourth positions from the outer circumferential side within the second slots 51a six slots away are arranged to"} +{"output_text": " elements is no longer aligned with the plane of the frame of the body carrier. 
In this case, the locking nose of the body carrier can be pushed through between the two locking elements of the conveyor hanger, so that the body carrier can be lifted from the conveyor hanger.\nIn order to prevent this, it is proposed in DE-40 41 211-C1 to provide the locking nose with a locking projection, which extends in the direction of the plane of the frame of the body carrier", "input_text": "wards at an angle and towards one another, wherein their lower ends are arranged at a transverse distance from one another so that they form an opening between them, and wherein the angle bisector of the angle formed by the two locking elements extends through the cylinder axis of the circumferential wall of the protective bell.\nThese two locking elements interact with a locking nose, which is secured to the guide element of the lower contacting device, is approximately of a saw tooth shape in the side view and may be pushed through between the two locking elements from below with a horizontally extending body carrier and vertically extending conveyor hanger, but prevent any lifting of the protective bell from the lower contacting device as soon as--when seen transversely to the direction of conveyance and horizontally--the angle formed by the plane of the frame of the conveyor hanger and the plane of the frame of the body carrier deviates from 90.degree. by more than a specific angle. Since--as is apparent from FIG. 
1 of DE-40 41 211-C1--the overhead conveyor submerges the body carriers in the paint bath at an angle and also guides them out of the paint bath again at an angle, wherein the plane of the frame of the guide hangers still extends at least approximately vertically even in this case, such angular deviations occur not only during the submergence of the bodies into the paint bath but also during the guidance of the bodies out of the paint bath.\nIt has been shown in practice that the construction resulting from DE-40 41 211-C1 cannot ensure a reliable locking of the body carrier on the conveyor hangers, namely during the submergence of the bodies into the paint bath: In this respect, the conveyor hanger adjacent to the trunk of the body and the body carrier can be pivoted relative to one another such that the angle bisector of the V-shaped configuration formed by the two locking"} +{"output_text": " decommissioning.\nDebt Financing: The utility borrows money to cover decommissioning costs. The utility must pay interest on the debt and must pay the debt off at the end of the decommissioning period.\nEquity Financing: The utility sells bonds to raise money to cover decommissioning costs. The utility must pay interest on the bonds and must pay the bonds off at the end of the decommissioning period.\nThe most common method of financing is prepay", "input_text": " the facility are dismantled or decontaminated while other parts of the facility are left in SAFSTOR. The decision may be based on factors besides radioactive decay such as availability of waste disposal sites. However, most facilities will use either immediate DECON or a DECON after some period of SAFSTOR.\nAs stated, under NRC regulations, decommissioning must be completed within 60 years. 
A time beyond that will be considered only when necessary to protect public health and safety in accordance with NRC regulations.\nActual Decommissioning Experience\nAs of January 1998, there have only been five plants that have completed the DECON process, three nuclear power plants, and two Department of Energy (\u201cDOE\u201d) plants. Further, six nuclear power plants are now in various stages of dismantlement and decontamination and eleven nuclear power reactors are currently in long term storage (SAFSTOR).\nDecommissioning Cost Estimates\nThe total cost of decommissioning is dependent on the sequence and timing of the various stages one through three, described above. Deferment of a stage tends to reduce its cost, due to decreasing radioactivity, but this may be offset by increased storage and surveillance costs.\nEven allowing for uncertainties in cost estimates and applicable discount rates, decommissioning contributes less than 5% to total electricity generation costs. In the United States, many utilities have revised their cost projections downwards in the light of experience, and estimates from 1998 now average $325 to $500 million per reactor and up.\nFinancing methods vary; however, the most common methods are:\nPrepayment: Money is deposited in a separate account to cover decommissioning costs even before the plant begins operation. This may be done in a number of ways but the funds cannot be withdrawn other than for"} +{"output_text": " by using a microplate in which a plurality of wells are formed. In the assay, a sample is placed in each of the wells, and a substance to be detected is labeled with a fluorescent dye. Then, the sample is irradiated with excitation light, and fluorescence is detected. In this manner, the presence of the detection target substance is confirmed.\nIn the aforementioned fluorescence detection method, a sample is irradiated with excitation light, and fluorescence is detected. 
However, in the case where the sample is", "input_text": "oglucan transferase, catalase or peroxidase for improving cotton fiber characteristics); WO 01/40250 (improving cotton fiber quality by modulating transcription factor gene expression); WO 96/40924 (a cotton fiber transcriptional initiation regulatory region associated which is expressed in cotton fiber); EP0834566 (a gene which controls the fiber formation mechanism in cotton plant); WO2005/121364 (improving cotton fiber quality by modulating gene expression); WO2008/075364 (improving fiber quality, yield/biomass/vigor and/or abiotic stress tolerance of plants). 1. Field of the Invention\nThe present invention relates to a detection method, a detection apparatus, a sample cell for detection and a kit for detection to detect a substance to be detected (a detection target substance) in a sample.\n2. Description of the Related Art\nConventionally, in the field of bio-measurement and the like, a fluorescence detection method is widely used as a highly accurate and easy measurement method. In the fluorescence detection method, a sample that is presumed to contain a detection target substance that outputs fluorescence by being excited by irradiation with light having a specific wavelength is irradiated with excitation light having the specific wavelength. At this time, fluorescence is detected to confirm the presence of the detection target substance. Further, when the detection target substance is not a phosphor (fluorescent substance), a substance that has been labeled with a fluorescent dye and that specifically binds to the detection target substance is placed in contact with the sample. Then, fluorescence from the fluorescent dye is detected in a manner similar to the aforementioned method, thereby confirming the presence of the bond between the detection target substance and the substance that specifically binds to the detection target substance. 
In other words, presence of the detection target substance is confirmed, and this method is widely used.\nIn bio-measurement, an assay is performed, for example,"} +{"output_text": " the steam or water introduced into the dilute phase.\nThe present invention also employs one or more nozzles configured to allow the secondary oxygen-containing gas, and, optionally, a shield gas, to be introduced into the dilute phase of the regeneration reactor so as to provide combustion conditions, and control the temperature rise due to afterburn in the dilute phase.\nThe present invention also employs one or more nozzles configured to allow the secondary oxygen-containing gas, and, optionally, a shield gas,", "input_text": "a) contacting the spent catalyst with a primary oxygen-containing gas in the dense phase of the reactor, thereby combusting the coke and forming a combustion gas comprising nitrogen oxide and carbon monoxide which further react in said dense phase, thus reducing a majority of the nitrogen oxides to form elemental nitrogen, thereby forming a nitrogen-enriched combustion gas; and\n(b) contacting the nitrogen-enriched combustion gas in the dilute phase of the reactor with a secondary oxygen-containing gas, wherein the carbon monoxide is oxidized to form carbon dioxide.\nThe amount of the primary oxygen-containing gas in step (a) is adjusted so that the nitrogen-enriched combustion gas prior to step (b) comprises up to 1% carbon monoxide. 
As a result of this process, nitrogen oxide emissions from the regeneration reactor are significantly reduced while the temperature rise due to afterburn in the dilute phase is minimized and controlled by the introduction of a shield gas or heat removal devices.\nThe present invention also employs one or more nozzles configured to allow the secondary oxygen-containing gas, and, optionally, a shield gas, to be introduced into the dilute phase of the regeneration reactor so as to provide combustion conditions, and control the temperature rise due to afterburn in the dilute phase.\nThe secondary oxygen-containing gas introduced to the reactor oxidizes the residual CO exiting the dense phase. Steam or water may be added to the secondary oxygen-containing gas stream as a shield gas to assist in the even distribution of oxygen across the regenerator vessel and to reduce the temperature rise in the dilute phase due to the combustion of CO. The location of the one or more nozzles feeding the steam or water is selected such that there is minimal contact of steam with the majority of the catalyst, thereby avoiding catalyst deactivation. 
The excess heat generated in the dilute phase due to the exothermic CO oxidation may also be removed by"} +{"output_text": " of the same color, and the one or more light-emitting elements of the second pixel comprise third and fourth light-emitting elements for emitting light of the same color.\nIn some embodiments, the first pixel comprises: first and second light-emitting elements for emitting light of the same color; a first electrode disposed under the first light-emitting element; a second electrode disposed over the second light-emitting element; and a charge generation layer disposed between the first and second light-emitting elements, wherein", "input_text": " the second driving voltage.\nIn some embodiments, the first pixel comprises: first and second light-emitting elements for emitting light of the same color; a first electrode disposed under the first light-emitting element; a second electrode disposed over the second light-emitting element; and a charge generation layer disposed between the first and second light-emitting elements, wherein the first electrode, the first light-emitting element, and the charge generation layer form one of the two light emitting diodes of the first pixel, and the charge generation layer, the second light-emitting element, and the second electrode form the other one of the two light emitting diodes of the first pixel.\nSome embodiments include an OLED display comprising: a first pixel and a second pixel, wherein each of the first and second pixels comprises: a driving transistor; and one or more light-emitting elements connected to the driving transistor; wherein at least on light-emitting element of the first pixel has a shorter lifetime than at least one light-emitting element of the second pixel, and wherein the OLED display further comprises circuitry for generating data signals in response to luminance levels and supplying the data signals to the driving transistors, wherein for a given luminance level, the 
corresponding data signal for the driving transistor of the first pixel has a larger voltage than the corresponding data signal for the driving transistor of the second pixel.\nSome embodiments further comprise a third pixel which comprises a driving transistor and one or more light-emitting elements connected to the driving transistor, wherein the circuitry is for generating and supplying the data signals to the driving transistor of the third pixel, wherein for a given luminance level, the corresponding data signal for the driving transistor of the first pixel has a larger voltage than the corresponding data signal for the driving transistor of the third pixel.\nIn some embodiments, the one or more light-emitting elements of the first pixel comprise first and second light-emitting elements for emitting light"} +{"output_text": " is attached to the luggage. The receiver is activated by the radio frequency signal, which is received by the receiver. The receiver is activated by the radio frequency signal, which is received by the receiver. The receiver is activated by the radio frequency signal, which is received by the receiver. The receiver is activated by the radio frequency signal, which is received by the receiver. The receiver is activated by the radio frequency signal, which is received by the receiver. The receiver is activated by the radio frequency signal", "input_text": " laser beam, which can be easily detected. The tag also has sound generating capability and aids in locating the article. Such a device is nor operable for a very long time and could not readily be attached to a bag and located on an airport conveyor belt. To be operable, an exact code has to be provided by the transmitter. In addition, the object has to be viewed within a narrow angular range to observe the light emitted by the laser diode. This presents significant operational drawbacks, since other objects may completely cover the laser diode light. 
Similarly, the sound generated would likely be too weak to be heard in the noisy environment of an airport. In operation, the detector and sound generation, as well as the transmitter generation would consume significant power, limiting the useful life of the device. This is especially so considering the size of the tag that is described by the '832 patent.\nU.S. Pat. No. 6,147,602 to Bender discloses a carrying bag, which has a light on the outside so that the bag is visible. The lights are controlled by a timing circuit, turned on for a set period by the \u201coff to on\u201d transition of a motion responsive switch occurring outside the set period. With this arrangement, motion response is ignored if the lights are turned on. In operation, the lights are turned on by motion sensors, which activate the lights when the bag is moved. Upon being activated, the light remains in the \u201con\u201d condition for a set period of time. Alternatively, the lights may be turned on manually. No disclosure is contained in the '602 patent concerning a backpack locating device that aids in locating a bag or backpack amongst similar bags on an airport conveyor belt or carousel.\nU.S. Pat. No. 6,158,872 to Rodgers discloses a luggage locator system. A transmitter sends a coded radio frequency signal to a receiver, which"} +{"output_text": " regimen, a newly-prescribed diet, a newly-prescribed exercise regimen, a newly-prescribed weight loss regimen, a newly-prescribed smoking cessation regimen, a newly-prescribed stress reduction regimen, a newly-prescribed alcohol consumption regimen, a newly-prescribed sleep regimen, a newly-prescribed stress reduction regimen, a newly-prescribed exercise regimen, a newly-prescribed weight loss regimen, a newly-prescribed diet regimen, a newly-prescribed", "input_text": " corresponding to the second type of test strip. For example, the meter may read both blood lipid test strips and blood glucose test strips. 
As noted previously, the meter typically includes four romkey sockets that allow the meter to carry and read four different romkeys.\nThe meter may also prompt the user to enter diagnostic information using the user input device, such as gender, ethnicity, family history of heart disease, personal history of heart disease, personal history of diabetes, personal history of smoking, height, weight, age, blood pressure, and fitness level. The meter may then perform a diagnostic analysis and produce diagnostic results based on the test results and diagnostic information, and display diagnostic results. For example, the diagnostic results may include a medical risk index, a recommended weight loss, a five-year risk of heart attack, a ten-year risk of heart attack, a cardiac age, an extended age, and a risk of stroke.\nThe invention also provides a system for remotely producing health reports. This system includes a health monitoring device or meter, as described above, a computer station, and a health report server connected with the computer station through a network, such as the Internet. The meter writes health-related test results to a memory storage device. The computer station reads the test results from the memory storage device, establishes a network connection with the health report server, receives additional diagnostic information from a user, and transmits the test results and the additional diagnostic information to the health report server. The server, in turn, compiles a health report based on the test results and the additional diagnostic information and transmits the health report to the computer station, where the report may be printed and delivered to the patient.\nThe health report may include a trend analysis with test results compiled for a number of samples, such as total cholesterol level and blood glucose level trend reports. 
The additional diagnostic information may include a newly-prescribed drug"} +{"output_text": " disadvantage of silicon substrates is that they are not suitable for use in the fabrication of micro-gas chromatograph columns that are used outside of the laboratory.\nIn addition to the above-mentioned disadvantages, the fabrication of micro-gas chromatograph columns from silicon substrates is also problematic because the silicon substrate is not suitable for use in the fabrication of micro-gas chromatograph columns that are used outside of the laboratory. For example, the silicon substrate is not suitable for use in the fabrication of micro-gas", "input_text": " chromatograph that can be used outside of the laboratory, such as where the samples are collected. Portable gas chromatographs have potential application for leak detection, environmental screening, monitoring the volatile organic chemical content of waste water, and in the detection and analysis of vent gases, land fill gases, and natural gas.\nOne of the most significant barriers to making a portable gas chromatograph device is that the separation efficiency of the device is directly proportional to the length of the column. Currently, a few portable gas chromatography systems are available, but they are only suited for the detection of certain specific substances. In recent years, efforts have been made to fabricate the column and detector using newly developed micromachining techniques in order to provide miniaturized gas chromatography systems that are portable and that can analyze multiple substances.\nSuch micro-gas chromatograph devices are most commonly fabricated from silicon substrates. However, such substrates have a number of disadvantages. For example, a micro-gas chromatograph column has been fabricated by etching an interlocking spiral channel about 10 microns deep and 300 microns wide in a silicon wafer. 
See Reston, et al., xe2x80x9cSilicon-Micromachined Gas Chromatography System Used to Separate and Detect Ammonia and Nitrogen Dioxide,xe2x80x9d J. Microelectromechanical Systems, 3:134-146 (1994). The top surface of the column was defined by a borosilicate glass plate anodically bonded to the silicon wafer. Because the bond frequently failed along the edges, presumably because of the mismatch in thermal expansion coefficients of the two materials, the column was restricted to an area in the center of the wafer about 3.8 cm in diameter. Accordingly, the anodic bonding process used with silicon substrates serves to limit the length and, thus, the separation efficiency of the column. Another"} +{"output_text": ".\nAs described above, although the coated film described in JP-A2000-282256, JP-A2000-282267, JP-A2002-50460, and WO2003/048416 has a good film-forming property, the resin adhesiveness is insufficient. Therefore, the coated film is not suitable for use in a product requiring high resin adhesiveness such as a touch panel.\nAs described above, although the coated film described in JP-A2000-282", "input_text": " or superior to that provided by conventional chromate treatment. However, since another important effect resulting from chromate treatment, i.e., good resin-adhesiveness, has not been developed yet, urgent development has been desired.\nAs described above, with respect to a coated film formed by the chromate-free treatment method, the adhesiveness between the coated film and the resin layer formed on the coated film (resin-adhesiveness) has not been sufficiently studied and a coated film providing excellent resin adhesiveness has not been obtained yet. For example, according to an investigation by the inventors, it was found that the coated film described in JP-A2000-282256 does not provide sufficient resin adhesiveness under wet conditions. 
As for the coated film described in JP-A2000-282267, a strong resin layer is expected because the coated film contains a cross-linkable resin to compensate for the poor water-resistance of the hydrophilic resin in order to provide a good film-forming property. However, since it lowers the percent content of zirconium as an inorganic component which provides a barrier function, the corrosion resistance becomes insufficient. As for the coated film described in JP-A2002-50460, high resin-adhesiveness is seemingly expected because the organic resin is added to a vanadium compound. However since fingerprinting resistance is improved by the resin addition, adhesiveness to the organic material such as the resin is lowered. Although a reduced vanadium compound is normally capable of improving corrosion resistance by delocalized corrosion electrons because of its electrical conductivity, the delocalized effect is reduced when the organic resin is mixed, which leads to insufficient resin adhesiveness. As for the coated film described in WO2003/048416, although the corrosion resistance is extremely excellent, since almost no defect can be created at the formation of the coated film, the resin adhesiveness is insufficient"} +{"output_text": " capacity of the transmission channels (as is frequently the case, given the high data rate demands for video) the high peak data rates from I frames or large P or B frames result in a high frame latency.\nOf course, the above discussion only characterizes the compression algorithm latency created by large B, P or I frames in a GOP. If B frames are used, the latency will be even higher. The reason why is because before a B frame can be displayed, all of the B frames after", "input_text": " frame is subject to transmission latency, all of the frames need to be delayed by at least that latency, so the worst-case frame latency will define the latency for every video frame. 
The I frames introduce the longest transmission latencies since they are largest, and an entire I frame would have to be received before the I frame could be decompressed and displayed (or any interframe dependent on the I frame). Given that the channel data rate is 2 Mbps, it will take 303,935/2 Mb=145 ms to transmit an I frame.\nAn interframe video compression system as described above using a large percentage of the bandwidth of the transmission channel will be subject to long latencies due to the large size of an I frame relative to the average size of a frame. Or, to put it another way, while prior art interframe compression algorithms achieve a lower average per-frame data rate than intraframe-only compression algorithms (e.g., 2 Mbps vs. 40 Mbps), they still suffer from a high peak per-frame data rate (e.g., 303,935*60=18.2 Mbps) because of the large I frames. Bear in mind, though that the above analysis assumes that the P and B frames are all much smaller than the I frames. While this is generally true, it is not true for frames with high image complexity uncorrelated with the prior frame, high motion, or scene changes. In such situations, the P or B frames can become as large as I frames (if a P or B frame gets larger than an I frame, a sophisticated compression algorithm will typically \u201cforce\u201d an I frame and replace the P or B frame with an I frame). So, I frame-sized data rate peaks can occur at any moment in a digital video stream. Thus, with compressed video, when the average video data rate approaches data rate"} +{"output_text": " transistors (TFTs), and the like are formed, and the other is a counter substrate in which counter electrodes, color filters, and the like are formed. The TFT substrate and the counter substrate are bonded to each other with a gap of about 1 to 10 \u03bcm. A liquid crystal is filled in the gap. The TFT substrate and the counter substrate are bonded to each other with a sealant. 
The sealant is formed of a thermosetting resin. The sealant is cured by heating", "input_text": " remove TTR amyloid deposited in tissues\nSince almost all TTRs in blood are produced in the liver (Non-patent reference 2), the most common therapy at present is liver transplantation as classified in (1) above. Although delay in progression of the disease is observed by liver transplantation, it is inevitable to use an immunosuppressant through life with a great burden to donors and patients. Besides, deposition still continues in several organs including the eyes and the heart and thus exacerbation of symptoms in these organs can be seen in not a few cases (Non-patent reference 8). As such, it is problematic and hence development of an effective therapeutic method is earnestly desired.\nFor other therapeutic methods than liver transplantation, therapeutic methods using siRNA or an antisense oligonucleotide is at a stage of clinical development in case of the strategy (1). However, with all these methods, production of not only variant TTRs but also wild-type TTR is suppressed and thus their safety assessment when used for a long period of time should carefully be done. As for the strategy (2), a medicament has been developed that binds to the T4-binding sites of a TTR tetramer to thereby stabilize the tetrameric structure. The new medicine Vyndaqel\u00ae developed in accordance with the strategy has been approved in EU in 2011 and in Japan in 2013. As the result of clinical test for as long as 30 months, Vyndaqel\u00ae exhibited the effect to delay peripheral neuropathy in FAP patients but failed to suppress completely the progress of symptoms (Non-patent reference 9). Also for the strategies (3) and (4), although plural kinds of medicaments are at a stage of clinical development, the status quo is that none of the therapies can be a radical treatment. A liquid crystal display device has two substrates. 
One is a TFT substrate in which pixel electrodes, thin film"} +{"output_text": " other than oxygen. The first cell and the second cell are connected in series. The heater includes a first heater element connected to the first cell and a second heater element connected to the second cell. The heater control circuit supplies power to the heater in a pulse-width modulated manner. The correction circuit monitors values of the gas concentration signal in the power supply-on duration and the power supply-off duration, averages the values of the gas concentration signal in the power supply-on duration and the power supply", "input_text": " signal as a function of concentration of a specified component of gasses to be measured, a heater heating the sensor element, and an insulator disposed between the sensor element and the heater; (b) a heater control circuit controlling a supply of power to the heater of the gas concentration sensor in pulse-width modulation; and (c) a circuit detecting the gas concentration signal either for a power supply-on duration in which the power is supplied to the heater through the heater control circuit or for a power supply-off duration in which the supply of power to the heater is cut.\nAccording to the fifth aspect of the invention, there is provided a gas concentration measuring apparatus which comprises: (a) a gas concentration sensor including a sensor element producing a gas concentration signal indicative of the concentration of a preselected component contained in gasses, a heater heating the sensor element, and an insulator disposed between the sensor element and the heater; (b) a heater control circuit supplying power to the heater cyclically using a pulse-width modulated (PWM) signal; and (c) a correction circuit monitoring values of the gas concentration signal in a power supply-on duration for which the power is supplied to the heater and a power supply-off duration for which supply of the power to the heater is cut off, the 
correction circuit corrects the gas concentration signal using the monitored values.\nIn the preferred mode of the invention, the correction circuit averages the values of the gas concentration signal in the power supply-on duration and the power supply-off duration and corrects the gas concentration signal using an averaged value.\nThe sensor element includes a first cell which is responsive to application of voltage to discharge oxygen contained in the gasses and produces a current signal as a function of a concentration of the oxygen and a second cell which is responsive to application of voltage to produce a current signal as a function of a concentration of a given gas"} +{"output_text": " the BS transmits a pilot signal to the MSs, and the MSs measure the channel situations between the BS and the MSs using the pilot signal. The BS transmits data to the MSs using the channel situations measured by the MSs.\nIn the wireless communication system, the BS transmits the pilot signal to the MSs, and the MSs measure the channel situations between the BS and the MSs using the pilot signal. The BS transmits data to the MSs using the channel situations measured", "input_text": " used to lock into the spring wire.\nMore than one reversible item of jewelry can be attached by conventional means to one another. This orientation creates a single item of jewelry with multiple reversible ornamental members. 
Other, non-reversible, ornamental members can also be attached by conventional means to the reversible item of jewelry.\nIt is an object of this invention to provide jewelry with ornamental members that can be reversed relative to their frame.\nIt is another object of this invention to provide reversible jewelry that contains two mating insert pillows.\nIt is another object of this invention to provide reversible jewelry that can be flipped easily to expose the opposite face.\nIt is another object of this invention to provide reversible jewelry that can be locked into place within its frame. Field of the Invention\nThe present invention generally relates to an apparatus and method for feeding back channel quality information and performing scheduling using the fed-back channel quality information in a wireless communication system. More particularly, the present invention relates to an apparatus and method for feeding back channel quality information and performing scheduling using the fed-back channel quality information in a wireless communication system based on Orthogonal Frequency Division Multiple Access (OFDMA).\nDescription of the Related Art\nConventionally, a wireless communication system performs communication through a radio channel between mobile stations (MSs) or between an MS and a base station (BS) of a predetermined network. The wireless communication system was initially developed to provide a voice service, but has been advanced to provide a data service in response to user requests. Technologies are needed which can efficiently transmit data due to an increase in an amount of data to be transmitted and an increase in the number of users. 
According to this need, wireless communication systems transmit user-by-user data by correctly detecting channel situations between the BS and the MSs.\nIn a method for detecting the channel situations between the BS and the MSs,"} +{"output_text": " deformed by the operating part is reduced, and consequently, the operational properties of the pointing device are deteriorated.\nIn addition, in the case where the elastic member is formed by a rubber or a coil spring, the elastic member is required to have a certain degree of thickness in order to ensure a spring force. Therefore, the dimension in a vertical direction of the elastic member is inevitably increased, and consequently, the height of the pointing device is increased.\nIn the case where the elastic member is formed", "input_text": " it is required to facilitate the reduction of thickness of the data processor, the pointing device is also required in itself to be reduced in the height thereof, i.e., a distance between the bottom surface of the base part and the top surface of the operating part should be decreased as much as possible.\nIncidentally, the built-in type pointing device as described above essentially imposes an input operation within a limited area on a operator, and besides, there is a possibility of further deteriorating the operational properties when the height reduction is facilitated. Therefore, in the case where the operator regards the good operational properties of the pointing device as important when using the data processor with a reduced thickness, a detachable type pointing device, in which the dimensions are not related to the dimensions of the data processor, is used. In particular, when the data processor is used in a portable way, it is difficult to ensure a space for operating a mouse having a normal structure, in which an operative ball is partially exposed on the back side of the mouse. 
Consequently, it has been proposed that a pointing device having a structure similar to that of the conventional built-in type pointing device is used while being detachably attached to the housing body of the data processor (see, e.g., Japanese Unexamined Patent Publication (Kokai) No. 6-139013 (JP-A-6-139013)).\nWhen it is required to reduce the height of the conventional pointing device having the magneto-electro transducer, it is also required to decrease the dimension of the components of the pointing device in a vertical direction, corresponding to height in the assembled state thereof. However, if the dimension in a vertical direction of the elastic member, generally formed by a rubber or a coil spring for biasing the operating part toward the initial position, is excessively reduced, it is possible that a spring force generated when the elastic member is"} +{"output_text": "-xcex3-butyrolactone (I) is obtained.\nThe process for producing optically active 3-hydroxy-xcex3-butyrolactone according to this invention is conducted according to the following reaction scheme: \nwherein R1 represents a C1-4 lower alkyl group, R2 represents a protective group for a hydroxyl group deprotected by hydrogenation with a heterogeneous hydrogenation catalyst, and the symbol * means an asymmetric carbon atom.\nThat is", "input_text": "rically hydrogenating a 4-substituted oxy-3-oxobutyrate represented by the general formula III: \nwherein R1 and R2 have the same meanings as defined above, in the presence of a ruthenium complex comprising an optically active phosphine compound as a ligand.\nFurther, this invention relates to the process for producing optically active 3-hydroxy-xcex3-butyrolactone, wherein R 2is an optionally substituted benzyl group, more preferably a benzyl group.\nFurther, this invention relates to the process for producing optically active 3-hydroxy-xcex3-butyrolactone, wherein the metal catalyst is a heterogeneous catalyst of palladium, iridium, rhodium, 
ruthenium, nickel, osmium or platinum.\nFurther, this invention relates to the process for producing optically active 3-hydroxy-xcex3-butyrolactone, wherein the acidic substance is p-toluenesulfonic acid, methanesulfonic acid, camphor sulfonic acid, sulfuric acid, trifluoroacetic acid, ferric chloride, zinc chloride or stannic chloride.\nHereinafter, this reaction is described in more detail.\nThe process for producing optically active 3-hydroxy-xcex3-butyrolactone according to this invention is conducted according to the following reaction scheme: \nwherein R1 represents a C1-4 lower alkyl group, R2 represents a protective group for a hydroxyl group deprotected by hydrogenation with a heterogeneous hydrogenation catalyst, and the symbol * means an asymmetric carbon atom.\nThat is, the optically active 4-substituted oxy-3-hydroxybutyrate (II) is hydrogenated in the presence of a heterogeneous hydrogenation catalyst and an acidic substance followed by deprotection and simultaneous ring closure thereof, whereby optically active 3-hydroxy"} +{"output_text": " typically mounted on the floor of the vehicle and are not adjustable to accommodate children of different sizes.\nIt is therefore an object of the present invention to provide a three point seatbelt system for restraining a child in a vehicle seat that is adjustable to accommodate children of different sizes.\nIt is another object of the present invention to provide a three point seatbelt system for restraining a child in a vehicle seat that is adjustable to accommodate children of different sizes without requiring the use of a booster seat.\nIt", "input_text": " a first side of the seat location and a guide through which the belt passes. The guide is typically a fixedly mounted, pivotable plate having a slot through which the belt passes or a fixedly mounted idler roller the belt traverses. 
The belt has first and second ends respectively connected to (1) an anchor fixedly mounted below the guide, and (2) a retractor fixedly mounted below the guide. In use, the belt is wound and paid from the retractor. The anchor and retractor are mounted at or in proximity to the same, second side of the seat location that is opposite to the first side of the seat location. The restraining system also includes a tongue slidably connected to the belt between the guide and anchor for connection to the buckle. The guide is mounted at or in proximity to the second side of the seat location at a height above the horizontal base of the seat location. In many systems, the height of the guide above the horizontal base of the seat location is adjustable to enable a typical adult sized occupant of the seat location to be safely and comfortably restrained on the seat base at the seat location while the tongue is connected to the buckle. The belt, when in use with the tongue in the buckle, includes (1) a first segment between the retractor and the guide, (2) a second segment between the guide and the tongue for restraining the shoulder and chest of the adult occupant, and (3) a third segment between the tongue and the anchor arrangement for restraining the lap of the adult sized occupant.\nThree point seatbelt systems of the above type, however, have not proven satisfactory for vehicle occupants who are not adult sized. It is mandated in most states of the United States that children under a certain size must sit in a booster seat that is held in place by three point seat belt systems of the above type. Such booster seat arrangements are"} +{"output_text": " machine, the sanitary standard of the food machine is described in the \u201cFood Sanitation Law\u201d (Food Sanitation Law, Ministry of Health, Labor and Welfare, Japan). 
The food machine is required to be free from the following materials harmful to the human body:\n(1) materials containing a heavy metal such as lead, mercury, cadmium, and arsenic;\n(2) materials containing a poisonous substance such as a pesticide, a pesticide residue, and a pesticide chemical;\n", "input_text": " polyamide-imide, which is impregnated with the fluorinated oil (patent document 6) are known. The bearing in which the retainer consisting of the porous article is impregnated with the alkylated cyclopentane oil serving as the lubricating oil. (patent document 7) is also disclosed.\nBut in the rolling bearings of the patent documents 1 and 6, because the porous retainer is impregnated with the fluorinated oil serving as the lubricating oil, a large centrifugal force is applied to the retainer during a rotation of the bearing. Consequently the rotational efficiency of the bearing deteriorates, and the torque fluctuates to a high extent. These rolling bearings are not sufficiently reliable in the durability thereof when they are used at a high surface pressure (about 2 GPa).\nIn the bearing of the patent document 7, the above-described problem which occurs owing to the use of the fluorinated oil for the retainer is solved. But the interconnected hole porosity of the retainer is 5 to 250. Therefore the amount of the lubricating oil with which the porous article of the retainer can be impregnated is small and hence it is impossible to prolong the period of time in which the bearing can be used.\nThe rolling bearing can be used for a food machine used to mix, knead, heat, dry, cool, charge, pack, and store food materials and edible products (or semi-products). As in the case of other machines, the bearing and other sliding parts are mounted on the food machine. It is necessary to prevent ingredients of these parts harmful to the human body from flowing into food. 
Therefore in accordance with the legal sanitary standard, it is necessary to choose materials such as resin, metal, lubricating oil, and grease, and additives composing the parts.\nAs the legal sanitary standard regarding various materials of the parts of the food"} +{"output_text": " particularly to a developing device and an image forming apparatus that can prevent a developer from being scattered.\n2. Description of the Related Art\nIn an image forming apparatus such as a copying machine, a printer, a facsimile machine, or a multifunction peripheral (MFP) having functions of these devices, an electrostatic latent image formed on a surface of a photoconductor is developed by a developing device to form a toner image. The toner image is transferred onto a recording medium such as a", "input_text": " radiation source. The operation of the reference radiation source at large monitoring intervals for only a short duration is necessary so that the aging of the reference radiation source is disregarded.\nIn these types of gas sensor arrangements, the settling time or period until usable measurement results are available is also relatively long after having been switched-off. For example, FIG. 7 shows a conventional pulse sequence for a reference radiation source and a measuring radiation source. Curve 701 denotes the pulse sequence for the measuring radiation source and curve 702 denotes the pulse sequence for the reference radiation source. In principle, the reference radiation source is only switched-on for referencing. In FIG. 7, the reference radiation source is switched-on at time t=tr. When the reference radiation source is first switched-on, it requires a certain amount of time until thermal equilibrium is established and reliable measurement values can be delivered.\nAt time t=tm, the measuring radiation source is again switched-on to continue measurement and at the same time to detect comparative values for correction. 
Because the measuring radiation source was switched-off during the time in which the reference radiation source was pulsed, the settled state for the measuring radiation source must also now be re-established.\nIt is commonly assumed that approximately four measurement values are required to reach the settled state and approximately four measurement values are required for referencing for a total of eight pulses. The referencing therefore lasts for 16 measuring cycles with unaltered pulse sequences being emitted. If approximately three seconds are calculated, for example, per measuring cycle, the entire referencing process lasts at least 48 seconds. There is therefore a total of 48 seconds during which no measurement data which can be utilized in a warning system. As a result, a considerable amount of gas could escape unnoticed through a gas leak during the referencing phase. 1. Field of the Invention\nThe present invention relates to developing devices and image forming apparatuses, and more"} +{"output_text": " region.\nU.S. Pat. No. 4,976,733 (Shook) discloses a toposcopic catheter and method of fabrication. In particular, the everting catheter assembly includes a primary catheter tube inside of which an everting or toposcopic element is secured spaced from the distal end of the primary catheter tube and with the open head end of the toposcopic element facing away from the open distal end of the primary catheter. The bonding of the", "input_text": ".S. Pat. No. 4,604,094 (Shook) discloses a toposcopic catheter and method of fabrication. In particular, the everting catheter assembly includes a primary catheter tube inside of which an everting or toposcopic element is secured spaced from the distal end of the primary catheter tube and with the open head end of the toposcopic element facing away from the open distal end of the primary catheter. 
The bonding of the toposcopic element to the interior of the primary catheter is effected with the use of a hollow mandrel through which the toposcopic element extends so that a short length of the toposcopic element projects out from the mandrel and is folded back over a length of the outer surface of the mandrel at the mandrel forward end. The mandrel is withdrawn into the primary catheter tube to the desired location for bonding at which point a bond is effected by means of heat conducted to the mating surfaces through the catheter wall or deposited locally by a radiative technique and heating between the mating surfaces. The tail end of the everting catheter is secured into a seal tube of greater rigidity by disposing the open tail end of the toposcopic element over a bias-cut end of the seal tube. Bonding is effected by application of heat to the mating surfaces. The bias-cut on the seal tube permits complete collapse of the toposcopic element upon pressurization of the annular cylindrical region from which the eversion is effected. Since the protruding bias-cut does not preferentially bend so as to evert within itself, a geometrically unbalanced collapse of the toposcopic element occurs. Introduction of pressurized sterile eversion fluid media into the annular region is facilitated by means of a bleed tube extending to the lead end of the annular region so as to permit pre-existing gas to be bled from the annular"} +{"output_text": ".\nU.S. Pat. No. 4,921,769 discloses a direct write photothermal litho plate that is sensitive to infrared radiation. The plate is prepared by coating a substrate with a radiation sensitive composition comprising a radiation sensitive polymer, a radiation sensitive monomer, and a radiation sensitive initiator. The radiation sensitive composition is then exposed to radiation to form a radiation sensitive coating on the substrate. 
The radiation sensitive coating is then developed to remove the radiation sensitive monomer and radiation sensitive initi", "input_text": " is referred to as positive-working. Conversely, when that portion of the coating which is exposed becomes hardened, the plate is referred to as negative-working. In both instances the image area remaining is ink-receptive or oleophilic and the non-image area or background is water-receptive or hydrophilic. The differentiation between image and non-image areas is made in the exposure process where a film is applied to the plate with a vacuum to insure good contact. The plate is then exposed to a light source, a portion of which is composed of UV radiation. In the instance where a positive plate is used, the area on the film that corresponds to the image on the plate is opaque so that no light will strike the plate, whereas the area on the film that corresponds to the non-image area is clear and permits the transmission of light to the coating which then becomes more soluble and is removed. In the case of a negative plate the converse is true. The area on the film corresponding to the image area is clear while the non-image area is opaque. The coating under the clear area of film is hardened by the action of light while the area not struck by light is removed. The light-hardened surface of a negative plate is therefore oleophilic and will accept ink while the non-image area which has had the coating removed through the action of a developer is desensitized and is therefore hydrophilic.\nDirect write photothermal litho plates are known such as the Kodak Direct Image Thermal Printing Plate. However, they require wet processing in alkaline solutions. It would be desirable to have a direct write photothermal litho plate that did not require any processing.\nThe prior art has tried to produce such plates by a variety of means. 
All of them fall short of a plate that has high writing sensitivity, high image quality, short roll up, and long run length without any processing"} +{"output_text": ", and inhibits the binding of Notch receptor molecule with its ligand, a Delta-1 or Serrate-1 molecule. This suggests that the polypeptide having amino acid sequence of the sequence listing, SEQ ID NO: 1, 2, 4 or 5, shows colony formation stimulation action by controlling the concentration of its action.\nThis fact suggests that inhibition of binding the polypeptide having amino acid sequence in the sequence listing, SEQ ID NO: 2-7 and these receptors can be used for finding", "input_text": " ID NO: 2-7 and undifferentiated cells.\nExpressed cell may be COS-7 cell as shown in Examples, but cells of human origin are preferable, and further expressed cells may be cell line or any of human in vivo blood cells and somatic cells. Consequently, the polypeptide can be expressed in vivo by integrated into vectors for gene therapy.\nAs shown in Example 10, FLAG chimera protein of human Delta-1 or human Serrate-1, both of which are low concentrated monomer, shows not a colony formation suppressive action but a colony formation stimulating action. This action may be involved in expressing Notch receptor and Notch ligand in the occasion of cell division of blood undifferentiated cells and acting the polypeptide of the present invention as an antagonist for that action. This suggests that the polypeptide having amino acid sequence of the sequence listing, SEQ ID NO: 1, 2, 4 or 5, shows colony formation stimulation action by controlling the concentration of its action.\nThis fact suggests that inhibition of binding the polypeptide having amino acid sequence in the sequence listing, SEQ ID NO: 2-7 and these receptors can be used for finding out molecules and compounds for stimulating cell differentiation. 
The methods include binding experiment using radio isotope, luciferase assay using transcriptional control factors, a down stream molecule of the Notch receptor, and simulation on the computer by X-ray structural analysis. Accordingly, the present invention includes screening method for pharmaceuticals using polypeptide in the sequence listing, SEQ ID NO: 2-7.\nAs shown in Example 13, specific leukemia cells can be differentiated by using IgG chimera protein of human Delta-1 or human Serrate-1. Consequently, the present invention can be applied for diagnostic reagents for leukemia or isolation of specific blood cells. This result indicates that human Delta-1 or human Serrate-1 molecule binds specifically with its receptor, a Notch receptor molecule"} +{"output_text": " includes U.S. Pat. No. 5,979,905; U.S. Pat. No. 5,979,904; U.S. Pat. No. 5,979,905; U.S. Pat. No. 5,979,904; U.S. Pat. No. 5,979,905; and U.S. Pat. No. 5,979,904.\nWhile", "input_text": " in order to separate for example certain components of the latter, for example blood plasma or blood platelets. 
This separation process can be performed, in a preferred form of implementation of the invention, by centrifuging.\nIn a further form of implementation, the invention concerns a method of producing a syringe, suitable for in vitro induction of interleukin 1 receptor antagonists, in which an inductor, preferably an immunoglobulin, in particular immunoglobulin G, is placed in the syringe with a protein-binding inner structure and incubated so that the inductor, in particular the immunoglobulin G, binds to the inner structure.\nIt is self-evident that the invention also concerns the syringe produced in this way, which is manufactured in a particularly preferred form of implementation from polystyrene, polypropylene or glass, the syringe being distinguished by a coating of its inner structure with an inductor, in particular with an immunoglobulin, preferably with immunoglobulin G.\nThe invention also relates to the use of immunoglobulin, in particular immunoglobulin G, for coating the inner structures of syringes, preferably made of polystyrene, polypropylene or glass, for the in-vitro induction of therapeutically-effective proteins, preferably interleukin 1 receptor antagonists.\nAdditional advantageous forms of the invention emerge from the sub-claims.\nThe invention is explained in more detail with reference to figures and examples of implementation. 1. Field of the Invention\nThe present invention relates to jacks and more particularly pertains to a new dual hydraulic jack system for more quickly raising the jack up to a load to be lifted.\n2. Description of the Prior Art\nThe use of jacks is known in the prior art. 
More specifically, jacks heretofore devised and utilized are known to consist basically of familiar, expected and obvious structural configurations, notwithstanding the myriad of designs encompassed by the crowded prior art which have been developed for the fulfillment of countless objectives and requirements.\nKnown prior art"} +{"output_text": ", 015001 (2009).\nIn addition, the inflection point is not necessarily the point where the second derivative is zero. In fact, the inflection point is the point where the second derivative is maximum. Thus, the inflection point is not necessarily the point where the second derivative is zero.\nIn view of the foregoing, there is a need for a method and apparatus for determining plasma potential \u03c6p that does not require the calculation of the second derivative of Ip(Vp", "input_text": " an unequivocal value for the plasma potential \u03c6p. Id.\nThus, conventional methods of finding plasma potential \u03c6p using a Langmuir probe require taking a second derivative of Ip(Vp) and determining the inflection point of Ip(Vp), i.e., the point where\n \u2146 2 \u2062 I p \u2146 V p 2 = 0.\nHowever, Langmuir probes are susceptible to contamination, and in many cases calculating the second derivative often is severely affected by noise and so introduces errors in the values of \u03c6p.\nConsequently, to avoid having to calculate the second derivative, many researchers resort to fitting routines of various forms, based in part on the probe geometry, to determine the inflection point, i.e., the point where\n \u2146 2 \u2062 I p \u2146 V p 2 = 0.See, e.g., J. J. Carroll, et. al., \u201cA segmented disk electrode to produce and control parallel and transverse particle drifts in a cylindrical plasma,\u201d Rev. Sci. Instrum., 65(9), 2991 (1994). These fitting routines also have been used to avoid errors introduced by probe contamination, but by their nature are only approximate and most often assume a Maxwellian distribution. 
Since the fit itself treats a complete curve, a fit to one area of the curve (such as the electron saturation region) influences the entire curve fit and therefore the determination of plasma parameters. Also fitting routines should be based on physical reasoning and not on the assumption of prevailing geometry (i.e., algebraic fits) as is often the case. R. F. Fernsler, \u201cModeling Langmuir probes in multi-component plasmas,\u201d Plasma Sources Sci. Technol. 18"} +{"output_text": " used a voltage comparator to compare the input voltage to a reference voltage. If the input voltage was below the reference voltage, the brownout detection circuit would activate the emergency lighting system. Other brownout detection circuits used a current comparator to compare the input current to a reference current. If the input current was below the reference current, the brownout detection circuit would activate the emergency lighting system.\nBrownout detection circuits using voltage comparators are relatively simple to implement. However, they are not very accurate.", "input_text": " voltage ratio. As described above, the voltage ratio determines the voltage transformation that takes place. There are times when the actual incoming voltage is different than the expected normal incoming voltage. When this happens, it may be advantageous to be able to change the voltage ratio in order to get the desired (rated) output voltage. Voltage taps, designed into the transformer's primary, deliver this desired flexibility. In other words, tapping the primary in a number of different spots provides a means to adjust the turns ratio and fine-tune the secondary output voltage. These tap connections are usually set at the factory for normal line voltages. 
During installation, the appropriate tap may be selected depending on the input voltage present at the installation site.\nIn emergency lighting systems, transformers have been used to step down an input voltage to a lower voltage, which is then used to power the charger circuitry. Because the transformer could have multiple input voltage taps, the transformer could accept input voltages of various magnitudes allowing the emergency lighting system to be used in different voltage environments. For example, one common method has been to utilize a 60 Hz line rated transformer with taps for 120 and 277 VAC. During installation the electrician could select the appropriate tap for the voltage level at the site.\nCapacitive divider circuits are also used to step down an input voltage to power the charger circuitry in emergency lighting systems. Like transformers, capacitive divider circuits can also have taps, which allow the use of emergency lighting systems using these circuits in different voltage environments. For example, a capacitive divider circuit with taps for 120 and 277 VAC could be used in an emergency lighting system.\nThe use of a transformer or capacitive divider circuit in past emergency lighting systems allowed for relatively simple brownout detection circuitry in those systems. There were two main categories of brownout detection circuits used in these systems. Some brownout circuits"} +{"output_text": " applied to the drill string. The cement must also be sufficiently strong to withstand the pressure of the formation fluids.\nThe cement slurry is typically prepared by mixing Portland cement, water, and additives such as accelerators, retarders, and/or set retarding agents. The cement slurry is then pumped down the drill string and into the annulus between the borehole and drill string. 
The cement slurry is displaced out the bottom of the drill string into the annulus by pumping a displacing fluid down the drill", "input_text": " of Portland cement.\nSecond, contamination of Portland cement with the drilling fluid is highly probable due to the process used to place the cement plug in the borehole. A successive displacement process is used wherein the cement slurry is pumped down the drill string (or similar work string with an inner diameter substantially smaller than the diameter of the borehole). The slurry is then displaced out the bottom of the drill string into the annulus between the borehole and drill string by pumping a second (displacing) fluid down the drill string. This displacing fluid is typically drilling fluid.\nThe borehole is filled with drilling fluid and the cement slurry exiting the bottom of the drill string is traveling at a higher velocity than the fluid moving up the much larger annular space. Thus the cement slurry is \"jetted\" into the drilling fluid as its flow direction changes 180 degrees. Any chemical incompatibility between the drilling fluid and cement slurry may produce a gelled mass that inhibits effective displacement of the drilling fluid by the Portland cement slurry which can result in contamination of the entire cement volume.\nSpacers are often used ahead of the Portland cement slurry to prevent contamination of the cement with the drilling fluid. These spacers are similar in composition to the drilling fluid. Clay and/or polymeric thickeners are used to viscosify the base fluid (usually water) in order to suspend weighting agents such as barites, hematites, or ilmenite. Emulsions of oil and water may also be used to provide viscosity for solids suspension. Often, surfactants and/or other solvents may be incorporated into the spacer to improve compatibility with the drilling fluid. 
Although more chemically compatible with the cement, spacer contamination of the cement slurry can occur and the effect on cement compressive strength is often similar to the effect of drilling fluid contamination.\nThird, the cement must adhere to the borehole walls to prevent downward movement of the plug when weight is"} +{"output_text": " level suitable for use in a magnetic resonance imaging system, the thickness of the ferroelectric material would have to be reduced to less than 1 xcexcm. However, the use of such a thin ferroelectric material would result in a structure which is not suitable for use in a magnetic resonance imaging system.\nIt is an object of the present invention to provide a microstructured material which has a high resonant frequency and which is suitable for use in a magnetic resonance imaging system.\nAccording to a first aspect of", "input_text": " xe2x80x83 \u2062 2 \u2062 c 1 d 1 ] ] Eq . xe2x80x83 \u2062 2 \nwhere r1 is the inside radius of the inner ring 28, a the lattice spacing of the rings, l the separation between the rings in a given column in an axial direction, d1 the separation between the rings in a radial direction, c1 the width of each ring in a radial direction and \"sgr\"1 the resistance per unit length of each ring.\nA further microstructured material described in United Kingdom Patent Application No. 2346485 (International Patent Application No. WO 00/41270) is constructed using a stack of conducting elements which comprise a single spiral shaped conductor 34 as illustrated in FIGS. 3(a) and 3(b).\nIt is also suggested that in United Kingdom Patent Application No. 2346485 (International Patent Application No. WO 00/41270) that the magnetic permeability of the structured material could he made to be switchable by incorporating an non-linear dielectric medium, such as Barium Strontium Titanate (BST) or other ferroelectric material, into the structure. 
The magnetic permeability of the structure is switched by changing the permittivity of the ferroelectric material by applying an electric field across the ferroelectric material. It is suggested that the ferroelectric material could be incorporated between the cylindrical tubes of each capacitive element (FIG. 1(b)) or between each of the concentric rings in a radial direction (FIG. 2(a)). The inclusion however of a ferroelectric material, such as BST, decreases the resonant frequency of the structure by a factor of more than 30 times. To increase the resonant frequency to a"} +{"output_text": " solution. The purified solution is then concentrated to a desired concentration and the concentrated solution is heated to a temperature of about 100.degree. C. to about 200.degree. C. to obtain a fused silica product. The fused silica product is then washed with water to remove the alkali metal ions and other impurities. The fused silica product is then washed with an acid to remove the OH-type anion exchange resin. The fused silica product is then washed with water to remove the acid. The fused silica", "input_text": " chelating agent directly to the impure fluosilicate acid solution. Purportedly, the chelating agent improves the purity of the first silica precipitate by sequestering or chelating multivalent metal ions in the solution before ammoniation. Ion exchange also has been used for the same purpose. However, these techniques tend to introduce other impurities, such as alkali metal ions, into the precipitated silica. Additionally, these prior art purification processes rely upon cationic exchangers and metal chelating agents and thus cannot satisfactorily remove the phosphorus and sulphur impurities generally present as anionic species (SO.sub.4.sup.-2 and PO.sub.4.sup.-3) in the fluosilicic acid by-product solutions typically recovered from the acidulation of phosphate rock. 
Nor can anionic exchange agents be used because the anionic exchange agents significantly decrease the recovery of silica.\nSilica produced in accordance with these methods is not satisfactory for use in producing high purity, transparent, bubble-free particles because the silica product contains too many impurities. With respect to silica produced by ammoniating ammonium fluosilicate, the subsequently fused particles are not transparent and bubble-free. Methods known in the art for producing fusible silica are complex and difficult to carry out. One alternative, natural quartz, is very expensive and reserves are limited. Further, natural quartz typically is not acceptable for high purity fused product unless it is purified.\nJapanese Patent 85(60)/42218 teaches a method of producing high purity silica suitable for electronic uses, for use as a filter for plastic resin, for use in adhesives, and the like. An aqueous solution of an alkali silicate is ultrafiltered to remove colloidal-sized particles. The filtered solution then is purified first with an acidic cation exchange resin, and then with an OH-type anion exchange resin, to obtain a purified"} +{"output_text": ") to be damped.\nThe object of the present invention is to provide an optical arrangement for the illumination of specimens for confocal scanning microscopes, which is of simple construction and which can be produced in a cost-effective manner.\nThis object is achieved by an optical arrangement for the illumination of specimens for confocal scanning microscopes, having an illuminating beam path and at least one light source, wherein the light source is a laser diode.\nThe optical arrangement according to the invention has the advantage that", "input_text": " tips abut on the stop, they cannot be thrust onto the spigots any farther. The clamping force of the pipette tips is limited by this. 
The springs are dimensioned such, and preloaded if need be, that the pipette tips abut on the stop accurately then when they sit on the spigots with the desired clamping force. The clamping force is determined such that the pipette tips sit and seal securely on the spigots.\nThe known pipette avoids high clamping forces, which would hamper the ejection of the pipette tips. However, the clamping force which is necessary for a safe seat and the sealing of the pipette tips on the spigots must be overcome in the ejection process. The overall ejection force to be applied is high, because several pipette tips must be squeezed off at the same time. This invention claims priority of the German patent application 100 44 636.1 which is incorporated by reference herein.\nThe present invention concerns an optical arrangement for the illumination of specimens for confocal scanning microscopes, having an illuminating beam path and at least one light source.\nOptical arrangements of the generic type have been known from practical use for some time; merely by way of example, the reader is referred to EP 0 495 930, which discloses an optical arrangement for the illumination of specimens for confocal scanning microscopes in which fluorescent specimens can be excited to fluoresce in the confocal scanning microscope with a single laser that exhibits multiple emission wavelengths. Concretely, this involves an argon-krypton laser.\nAlso known, from U.S. Pat. No. 5,161,053, is a confocal microscope in which light of an external light source is transported to the confocal scanning microscope with the aid of a glass fiber. It is thereby possible, in particular, for the vibrations induced by the laser light source (principally from cooling systems"} +{"output_text": "length portion. The sheath introducer is then withdrawn to release the extension graft from the insertion catheter, and the insertion catheter is withdrawn to leave the extension graft in place. 
The extension graft may then be coupled to the contralateral limb of the bifurcated graft by advancing the mating portion of the extension graft into the contralateral limb until the distal end of the mating portion is at the desired location, and then inflating the mating portion to conform the mating portion to the interior surface of the contralateral limb. The", "input_text": " of the contralateral limb held by the retainer ring, partially reinflating the tip balloon to hold the distal end and associated distal spring portion of the contralateral limb by friction, advancing the insertion catheter into the contralateral iliac vessel until the distal end of the contralateral limb is at the desired location, and finally reinflating the tip balloon fully to expand or break the retainer ring and release the spring portion. The deployment means may then be withdrawn and removed from the entry site and the entry site attended using standard procedure.\nIf the extent of disease indicates that a longer graft limb is necessary in either or both iliac vessels, an adjustable length extension graft may be coaxially coupled to a lateral limb, for instance the contralateral limb, of the bifurcated graft by the following procedure.\nThe extension graft is deployed via percutaneous entry through the contralateral femoral artery. A guide wire is directed through the contralateral limb and up into the primary limb of the bifurcated graft, and deployment means carrying a pre-loaded extension graft is directed over the guidewire to position the mating portion of the extension graft partially within the contralateral limb of the bifurcated graft such that a first spring portion at the proximal end of the mating portion is overlapped by the spring portion at the distal end of the contralateral limb. 
The sheath introducer may then be withdrawn while the push rod is held stationary to deploy the first spring portion, the insertion catheter moved upwards to locate the graft balloon within the first spring portion, and the graft balloon inflated to conform the first spring portion to the interior surface of the contralateral limb. Contrast media is injected through the first inner track of the insertion catheter to verify that the coupled graft limbs are not leaking. Next, the sheath introducer is further withdrawn to release a second spring portion defining a junction between the mating and adjustable-length portions, and a third spring portion at a distal end of the adjustable-"} +{"output_text": " current of the semiconductor laser; a switch means for selectively switching between the output of the error amplifier and the output of the D/A converter, to make a control signal of the current source; and an A/D converter for detecting a current control voltage at recording, and a digital value of the D/A converter is decided on the basis of a digital value of the A/D converter, and at the recording, the output of the error amplifier is switched over to the output of the D", "input_text": " as a reference can be obtained.\nAccording to Claim 4 of the present invention, there is provided a laser control apparatus which controls a power of a semiconductor laser in an optical recording/reproducing apparatus for performing recording/reproduction into/from an optical disk by a semiconductor laser, comprising: a photodiode for detecting a light of the semiconductor laser; a current/voltage conversion circuit for converting a current of the photodiode to a voltage, and outputting the voltage; a reference voltage source for deciding a reproduction power of the semiconductor laser; an error amplifier for amplifying a difference between the voltage outputted by the current/voltage conversion circuit and the reference voltage; a current source for passing a current through 
the semiconductor laser; a reproduction power control system for connecting an output of the error amplifier to the current source, to control the reproduction power; a D/A conversion circuit for deciding a current to be passed through the semiconductor laser; a switch means for selectively switching between an output of the error amplifier and an output of the D/A conversion circuit, to make a control signal of the current source; and an A/D conversion circuit for selecting one of an output voltage of the error amplifier and an output voltage of the D/A conversion circuit, and subjecting the voltages to digital conversion, in which a digital value of the D/A conversion circuit is decided on the basis of a digital value of the A/D conversion circuit, and at the recording, the output of the error amplifier is switched over to the output of the D/A conversion circuit and the current of the current source is controlled.\nAccording to Claim 4 of the present invention, the laser control apparatus comprises: a reproduction power APC system for controlling the reproduction power to make the power constant at the reproduction; an A/D converter for detecting a current control voltage at reproduction; a D/A converter for controlling a bias"} +{"output_text": "shaped resilient member is disposed between the pipeline sections and a pressure responsive element is disposed in the U-shaped resilient member. The resilient member is compressed by the pressure of the process fluid flowing in the pipeline sections and the pressure responsive element senses the pressure of the process fluid. The resilient member is compressed by the pressure of the process fluid flowing in the pipeline sections and the pressure responsive element senses the pressure of the process fluid. 
The resilient member is compressed by the pressure of the process fluid flowing in the pipeline", "input_text": " large and it becomes difficult to cool the resistor when the breakdown voltage becomes high.\nFurther, there is a problem that a clamp type snubber circuit of small loss for series connection is not present. Also, there occurs a possibility that surge voltages generated at the time of interruption of the current flowing in the series-connected IGBTs will not be uniformly generated in the respective IGBTs and an excessive voltage is applied to one of the IGBTs to destroy the IGBT.\nIf the main IGBTs are connected in series or in parallel, the voltages or currents thereof occurring at the switching time become unbalance depending on the characteristics of the main IGBTs. In order to eliminate the unbalanced state, it is required to set the characteristics of the main IGBTs equal. This lowers the yield of the IGBTs and raises the cost thereof.\nFurther, if the voltages become unbalance at the time of switching of the main IGBT, it becomes difficult to standardize the main IGBT and the peripheral circuit thereof. In this case, it becomes necessary to select the main IGBT and snubber circuit each time the power converting system is formed and the cost thereof may be raised. In typical prior isolating pressure sensors the pressure responsive element, e.g. pressure gauge, senses the pressure of the process fluid flowing in the line through an intermediary sensing fluid isolated from the process fluid by a resilient pressure transmitting member. Such isolating pressure sensors may be used, for example, where contact with the process fluid (e.g. an acid) would damage the pressure gauge. U.S. Pat. No. 3,163,529 and No. 3,563,095 disclose examples.\nIn one type of prior isolating pressure sensor for interposition coaxially between flanged ends of a coaxially opposed pair of pipeline sections, a U-"} +{"output_text": " like to wear on a particular trip. 
This is especially difficult when traveling to different countries. The traveler may not be able to bring all of the pieces of jewelry they desire. The traveler may also not be able to afford to purchase all of the pieces of jewelry they desire.\nThe present invention relates to a method and apparatus for controlling a vehicle. In particular, the present invention relates to a method and apparatus for controlling a vehicle that is capable of controlling a vehicle in a stable manner even", "input_text": " with this solution to this hazardous flue gas problem are: 1) the integral oxidation zone functions autogenously to produce an acceptable effluent flue gas stream without the necessity of any particular command and control provisions; 2) there is no necessity to contaminate the coke-containing dual-function catalyst that is being subjected to regeneration with an undesired CO oxidation promoter that can compromise or inhibit its performance when it is returned to the reactor side of the unit; 3) there is no risk of environmental contamination since the hazardous material does not leave the modified apparatus; and 4) if a portion of the resulting effluent flue gas stream is used as a diluent in the combustion gas stream charged to the coke combustion zone of the regeneration unit then the coke combustion reactions are not inhibited as they would be by the presence of substantial amounts of CO in the absence of the present invention. The present invention relates to jewelry and, more particularly, to jewelry containing rotatable or reversible faces for exposing a different face.\nFor centuries, one of the most common indications of sophistication and personal style is jewelry. Even ancient royalty in Egypt were known to wear jewelry and be entombed with it. Only the rich could afford to purchase jewelry and it was worn as a sign of wealth. Individual pieces of jewelry were often custom made to cater to the style of the rich. 
A large collection of jewelry was desirable, of course, since more jewelry implied more wealth. Historically, consumers wanting different jewelry designs or motifs were forced to purchase different types of jewelry. The jewelry\"\"s cost, design, style and colors are characteristics that have consumers choosing one piece of jewelry over another. Yet, high costs of jewelry often make it very difficult for consumers to purchase each design and motif of jewelry they desire.\nTravelling with jewelry also poses another problem. People are forced to choose the pieces of jewelry they would"} +{"output_text": " of a cylinder, and is supported by a vane holder. The vane holder is fixed to a housing of the rotary compressor. The vane is supported by a vane holder, and is slidably inserted into a guide groove disposed in a radial direction of the cylinder. The vane is slidably inserted into the guide groove by a sliding operation of the vane.\nThe vane is slidably inserted into the guide groove by a sliding operation of the vane. The vane", "input_text": "FIG. 5 is a diagram 60 illustrating a typical receiver return loss mask (receiver return loss). To pass a typical common mode return loss mask, the impedance of the capacitor 55 at 10 MHz should be less than 50 ohms. This dictates a capacitor in excess of 225 pf, which is difficult to economically integrate into the termination network 41 due to large size. Further, integrating such a large capacitor also causes leakage and introduces a variety of manufacturing difficulties.\nAccordingly, a receiver termination network that overcomes these shortcomings is needed. 
The present invention relates to a multistage compression type rotary compressor in which an intermediate pressure refrigerant gas compressed by a first rotary compression element and discharged therefrom is sucked in a second rotary compression element, compressed and then discharged therefrom.\nIn this type of multistage compression type rotary compressor such as a high inner pressure type multistage compression rotary compressor, there has heretofore been a constitution in which a refrigerant gas is sucked in a low pressure chamber side of a cylinder from a suction port of a first rotary compression element, compressed by operations of a roller and a vane to obtain an intermediate pressure, and discharged from a high pressure chamber side of the cylinder to a discharge muffling chamber through a discharge port. Moreover, the intermediate pressure refrigerant gas discharged to the discharge muffling chamber is sucked in the low pressure chamber side of the cylinder from a suction port of the second rotary compression element, secondarily compressed by operations of a roller and a vane to constitute a high-temperature high-pressure refrigerant gas, and discharged into a sealed vessel from the high pressure chamber side through the discharge port and the discharge muffling chamber. Subsequently, the gas is discharged from the rotary compressor (see, e.g., Japanese Patent Application Laid-Open No. 2004-27970).\nEach vane is movably inserted into a guide grove disposed in a radial direction"} +{"output_text": " the reaction solvent is expensive, and therefore, the cost of the reaction system is high.\nThere is also a known method in which a phosphonitrile dichloride is reacted with an alkaline metal alcoholate in a reaction solvent of an aliphatic hydrocarbon having 6 to 9 carbon atoms (for example, see JP-A-2000-198793). 
However, the reaction time is long and the reaction temperature cannot be raised.\nThere is also a known method in which a phosphonitrile dichloride", "input_text": "). These methods use a quaternary ammonium salt in a large amount and the operation of collecting the quaternary ammonium salt is cumbersome. Moreover, since a lot of water is used at the time of a reaction, the reaction system is a biphasic system of water and an organic solvent, and a phosphonitrile dichloride tends to undergo hydrolysis and the reaction temperature cannot be raised. Therefore, a long time is needed for completion of the reaction. On the other hand, when the reaction temperature is raised in order to enhance the reactivity, the following problems arise such as hydrolysis becomes noticeable, phosphate traces derived from P\u2014OH portions generate and gelling tends to take place by a cross-linking reaction.\nThere is also a known method in which monochlorobenzene is used as a reaction solvent, and a cyclic phosphonitrile dichloride and an alkaline metal arylate and/or an alkaline metal alcoholate are reacted while the amount of moisture in the reaction system is controlled (for example, see JP-A-2000-198793). This method allows particles of the alkaline metal arylate and the alkaline metal alcoholate in the reaction solvent to be finely dispersed by reducing the amount of moisture at the time of preparing the alkaline metal arylate and the alkaline metal alcoholate, and thereby improves the reactivity. However, such improvement in the reactivity is still insufficient and the reaction time is long until completion.\nThere is a known method in which an aliphatic hydrocarbon having 6 to 9 carbon atoms is used as a reaction solvent and an alkaline metal alcoholate is prepared from an alkaline metal and an alcohol and then reacted with a phosphonitrile dichloride dissolved in monochlorobenzene (for example, see U.S. Pat. No. 3,939,228). 
It is possible to complete the reaction in a relatively short reaction time in this reaction. However, an alkaline metal is expensive and"} +{"output_text": "-doped indium oxide and a compound with at least one carboxyl group at its terminal position in the plasticized polyvinylacetal resin,\n(4) An interlayer film for laminated glass as described in (3) above, wherein the compound with at least one carboxyl group at its terminal position is one or more compounds selected from the group consisting of a carboxylic acid having 2 to 18 carbon atoms and a hydroxy carboxylic acid having 2 to 18 carbon atoms,\n(5", "input_text": " the efficiency of the electromagnetic wave shield xcex94dB in the wavelength of 10 to 2000 MHz of the laminated glass is not more than 10 dB,\n(25) A laminated glass as described in (20) to (24) above, wherein the laminated glass has a visible light transmittance rate (Tv) in the light rays of 380 to 780 nm, a solar radiation transmittance rate (Ts) in the light rays of 300 to 2500 nm, the haze value (H), the efficiency of electromagnetic wave shield(xcex94dB) in the wavelength of 10 to 2000 MHz and pummel value (P) as follows;\nTvxe2x89xa775%\nTsxe2x89xa60.8xc3x97Tv\nHxe2x89xa61.0%\nxcex94dBxe2x89xa610 dB\nP=a numeral from 3 to 7.\nAnd also, the present invention relates to;\n(1) An interlayer film for laminated glass, which is characterized by that an interlayer film for laminated glass is made from plasticized polyvinylacetal resin, and that tin-doped indium oxide and a compound with at least one carboxyl group at its terminal position are dispersed in the plasticized polyvinylacetal resin,\n(2) An interlayer film for laminated glass as described in (1) above, wherein the compound with at least one carboxyl group at its terminal position is one or more compounds selected from the group consisting of a carboxylic acid having 2 to 18 carbon atoms and a hydroxy carboxylic acid having 2 to 18 carbon atoms,\n(3) An interlayer film for laminated 
glass, which is made from plasticized polyvinylacetal resin, wherein said plasticized polyvinylacetal resin is prepared by dispersing tin"} +{"output_text": "er.\nA UICC is a smart card that is embedded with a microprocessor, a read-only memory (ROM), and an electrically erasable programmable read-only memory (EEPROM). A UICC is typically used for identification purposes and for storing credentials.\nAn embedded secure element is a secure element that is embedded in a mobile device. An embedded secure element is typically used for storing credentials and for executing applications.\nAn NFC enabler is a secure element that is implemented", "input_text": "The present invention relates to managing secure elements and more particularly to systems, methods, and computer program products for performing content management operations.\n2. Related Art\nA service provider (SP) is a company, organization, entity, or the like, that provides services to customers or consumers. Examples of service providers include account-issuing entities such as merchants, card associations, banks, marketing companies, and transit authorities. A service may be an activity, capability, functionality, work, or use that is permitted or provided by a service provider such as a payment service, a gift, offer or loyalty service, transit pass service, and the like.\nIn a mobile environment that involves contactless transactions between a mobile device and a service provider, information relating to the accounts and applications issued by the service providers must be downloaded onto mobile devices in order to enable them to perform the contactless transactions.\nA trusted service manager (TSM) is typically an independent entity serving mobile network operators (MNOs) and account-issuing service providers by provisioning applications, such as contactless applications associated with the service providers, to mobile devices. 
Typical TSMs can distribute and manage the contactless applications remotely because they have access to secure elements (SEs) in a near field communication (NFC) enabled mobile device.\nSecurity-critical applications, such as those involving payment and account credentials, require secure hardware storage and a secure execution environment. On mobile devices, this is usually handled by the secure element.\nThe secure element is a platform onto which applications can be installed, personalized and managed. It consists of hardware, software, interfaces, and protocols that enable the secure storage of credentials and execution of applications for payment, authentication, and other services.\nA secure element may be implemented in different form factors such as a Universal Integrated Circuit Card (UICC), an embedded secure element, or NFC enabl"} +{"output_text": " stack 3.\nA second \u201cthermopile junction\u201d is formed by a second aluminum trace M2 and a second polysilicon trace 12. The second polysilicon trace 12 is connected to the other end of the aluminum trace M1. The second polysilicon trace 12 is also connected to the other end of the aluminum trace M2. The second aluminum trace M2 is connected to a third aluminum trace M3. The third aluminum trace M3 is connected to a fourth aluminum", "input_text": "ilevered\u201d silicon dioxide membrane structures which extend from, or \u201coverhang\u201d from, a silicon base or the like. Such membrane structures tend to be fragile and therefore susceptible to damage during assembly operations that occur after formation of the membrane structures. For example, to achieve maximum sensitivity, the cavity openings covered by the SiO2 membrane in the infrared sensors described in the above mentioned Meinel et al. application should be as large as possible. 
The larger the cavity opening is, the more fragile the SiO2 membrane will be.\nThe closest prior art is believed to include the article \u201cInvestigation Of Thermopile Using CMOS Compatible Process and Front-Side Si Bulk Etching\u201d by Chen-Hsun-Du and Chengkuo Lee, Proceedings of SPIE Vol. 4176 (2000), pp. 168-178, incorporated herein by reference. Infrared thermopile sensor physics and measurement of IR radiation using thermopiles are described in detail in this reference. Prior Art FIG. 1A herein shows the CMOS-processing-compatible IR sensor integrated circuit chip in FIG. 1 of the foregoing article. \u201cPrior Art\u201d FIG. 1A herein is similar to that drawing, and Prior FIG. 1B herein shows the top perspective view of the same IR sensor integrated circuit chip illustrated in FIG. 2 of the foregoing article.\nReferring to Prior Art FIG. 1A herein, the IR sensor chip includes a silicon substrate 2 having a CMOS-processing-compatible dielectric (SiO2) stack 3 thereon including a number of distinct sub-layers. A N-type polysilicon (polycrystalline silicon) trace 11 and an aluminum trace M1 in dielectric stack 3 form a first \u201cthermopile junction\u201d where one end of the polysilicon trace and one end of the aluminum trace are connected. Additional oxide layers and additional metal traces also may be included in dielectric"} +{"output_text": " tank with a new ink tank.\nIn the conventional ink jet printer, the information relating to the ink tank is displayed on a display unit provided on the printer. However, in the case of the multifunction printer, the information relating to the ink tank is displayed on a display unit provided on the scanner. 
Therefore, the user cannot know the information relating to the ink tank unless the user goes to the printer.\nIn order to solve the above-mentioned problem, there is proposed a method wherein", "input_text": " results in additional installation space advantages, which permit an advantageous and therefore weight-optimized structural design of the vehicle body.\nOther objects, advantages and novel features of the present invention will become apparent from the following detailed description of one or more preferred embodiments when considered in conjunction with the accompanying drawing. 1. Field of the Invention\nThe present invention relates to a liquid container, more specifically to a liquid container wherein information relating to a state of the liquid container, such as a remaining amount of ink in an ink tank used on an ink jet printing apparatus is detected by a light-emitting means, for example, LED.\n2. Description of the Related Art\nRecently, as digital cameras have widely prevailed, uses are increasing wherein the printing is carried out while directly connecting a digital camera to a printer as a recording device without the intervention of a personal computer (PC). Such a printing is called as a \u201ccamera direct printing\u201d. Further, a printing method is also increasing wherein a card type information storing medium used for the digital camera in a detachable manner is directly mounted to a printer so that data are transferred and printed. 
This is called as a \u201ccard direct printing\u201d Also, a so-called multifunction printer has quickly been prevailing in the market, wherein a printer is integrated with a scanner to have a copying function without the intervention of PC, as well as the above-mentioned direct printing function.\nIn an ink jet printer, there are cases wherein a user desires to know information relating to individual ink tank such as a mounting state of the ink tank or a remaining amount of ink in the ink tank or it is desirable to inform such information to the user. For example, if the user knows that the remaining amount of ink in the ink tank is little, it is possible to avoid an accident wherein the printing is substantially impossible during the printing operation due to the lack of ink, by replacing the old ink"} +{"output_text": " a temperature in the range 60xc2x0 C. to 250xc2x0 C. for a period of time which is generally in the range 1 hour to 24 hours.\nThe catalyst of the invention can be used in the polymerization of olefins, in particular of ethylene, propylene, 1-butene, 1-hexene, 1-octene, 1-decene, 1-dodecene, 1-tetradecene, 1-hex", "input_text": " zeolite, for example, it is possible to impregnate this precursor with an aqueous solution of ammonium biborate and Rhodorsil ELP silicone from Rhxc3x4ne Poulenc, to dry at 80xc2x0 C., for example, impregnate with an ammonium fluoride Solution, then dry at 80xc2x0 C., for example, followed by calcining, preferably in air in a traversed bed, for example at 500xc2x0 C. for 4 hours.\nWhen the catalyst contains at least one group VIIA element, preferably fluorine, it is possible to impregnate the catalyst with an ammonium fluoride solution, to dry at 80xc2x0 C. for example, followed by calcining, preferably in air in a traversed bed, for example at 500xc2x0 C. 
for 4 hours.\nOther impregnation sequences can be used to obtain the catalyst of the invention.\nWhen the catalyst contains phosphorous, it is possible to impregnate the catalyst with a solution containing phosphorous, to dry, then to calcine.\nWhen the elements contained in the catalyst, i.e., at least one metal selected from the group formed by group VIII and group VIB metals, optionally boron, silicon, phosphorous, at least one group VIIA element, at least one group VIIB element, at least one group VB element, are introduced in a number of steps for impregnating the corresponding precursor salts, an intermediate step for drying the catalyst is generally carried out at a temperature which is generally in the range 60xc2x0 C. to 250xc2x0 C. and an intermediate catalyst calcining step is generally carried out at a temperature in the range 250xc2x0 C. to 600xc2x0 C.\nTo finish the catalyst preparation, the moist solid is left in a moist atmosphere at"} +{"output_text": " is shown in Table V.\nTABLE V ______________________________________ Preparation of Polymers of Class I, II, III, IV, V, VI, or VII by the Reaction of a Diamine with a Tricarboxylic Acid Anhydride and a Diamine with a Tricarboxylic Acid Anhydride and a Diamine with a Tricarboxylic Acid Anhydride and a Diamine with a Tricarboxylic Acid Anhydride and a Diamine with", "input_text": " tricarboxylic acid anhydride and un- or partially acylated aromatic diamines where the acylation level used is the amount that would give 50% acylation of all amine functionality utilized in the polymerization. By way of specific example, if a 3:1 ratio of aromatic to aliphatic diamine were used, all the aliphatic diamine would be diacylated and one third of the aromatic diamine functionality would be acylated. 
Polymer of Class V is prepared by the reaction of unacylated aliphatic, cycloaliphatic, or araliphatic diamines (which confines these amines to formation of two imide groups) with the tricarboxylic acid anhydride and fully or partially acylated aromatic diamines. The level of acylation can vary from 50% of the total amine functionality utilized in the reaction up to the aromatic diamine being fully diacylated. Polymer of Class VI is similar to the polymers of Class IV or V except that in polymers of Class VI the aliphatic diamines are used in both di imide or di amide formation and only about 50% of the total amine functionality is acylated. Polymers of Class VII are fully random in that both aliphatic diamine and aromatic diamine moieties are distributed between imide and amide portions and all trimellitoyl groups are free to be arranged head to head, tail to tail, or head to tail and 50 to 100% acylation of amine functionality is utilized. All of the foregoing polymers have an inherent viscosity in the range of 0.3 to 2.0 dl/g giving them molecular weights in the range of about 3,000 to 100,000. All these polymers can be injection molded and can be used as engineering plastics. They have excellent mechanical properties as shown in Table IV hereof.\nWith the use of Table II, a general method of preparation of these different polymers"} +{"output_text": " That is, the capacity of the tub is increased as the inner surface of the rear wall is positioned closer to the center of the tub.\nHowever, the conventional washing machine fails to solve the above problem.\nIn the conventional washing machine, the inner surface of the rear wall of the tub is positioned closer to the center of the tub. Therefore, the capacity of the tub is increased as the inner surface of the rear wall is positioned closer to the center of the tub.\nHowever, the", "input_text": " and a radial rib 321 is also formed on the outer surface of the rear wall in a radial direction. 
The farther the radial rib 321 is from the center of the rear wall, the lower the height thereof is. Thus, the rear wall of the tub may have a problem in its strength and rigidity. Furthermore, the distance between the radial ribs is gradually getting wider from the center thereof. Thereby, the strength of the rear wall of the tub is not high enough and it causes a structural-strength-related problem in the farthest portion of the rear wall from the center.\nAlso, the thickness between the circumferential rib 311 formed nearer to the center of the rear wall and the circumferential rib 312 formed farther from the center is the same. Since the nearer portion to the center is more affected by the vibration of the shaft than the farther portion, the nearer portion is needed to have much strength and rigidity. Nevertheless, the conventional washing machine fails to solve the above problem.\nMoreover, according to the conventional washing machine, the circumferential rib 311 and 312 and the radial rib 321 are formed relatively high to enhance the strength of the rear wall of the tub. However, the higher the ribs are, the more flexible the ribs are. Thereby, the strength or rigidity is not gained as high as structurally required and it causes a structural-strength-related problem as well as high production cost.\nMeanwhile, since the inner surface (not shown) of the rear wall of the tub is formed as a smooth surface without ribs, the reinforcement for the strength is dependent on the ribs on the outer surface only. That is one of the reasons why the ribs on the outer surface are high in the conventional washing machine.\nThe capacity of the tub for holding wash water therein influences the capacity of the washing machine. Here, the capacity of the tub is influenced by the position of the inner surface of the rear wall."} +{"output_text": " a pinned photodiode 13. 
The CMOS APS cell 10 includes a transfer transistor 11, a floating diffusion region 7, a reset transistor 12 and a source follower transistor 8. The photodiode 13 is formed as a pinned photodiode, which means that it has a pinning layer 20, a charge collection region 21, a charge transport region 22 and a pinning layer 23. The pinned photodiode 13 includes a p-n-p photodiode structure having a p-", "input_text": " provided by feeding back these signals to the programmable logic via feedback lines in the macrocells.\nAn object of the present invention is to provide digital logic circuit combining the functional flexibility of programmable logic devices and the fast operation of fixed CMOS combinatorial logic to perform flexible logic on a fast input. CMOS imagers are increasingly being used as low cost imaging devices. A CMOS imager circuit includes a focal plane array of pixel cells. Each of the pixel cells includes a photosensitive element, such as a photodiode, photogate, or photoconductor overlying a doped region of a substrate for accumulating photo-generated charge in an underlying portion of the substrate. A readout circuit is connected to each pixel cell and often includes a floating diffusion region for receiving charge from the photosensitive element, and a source follower transistor, which has a gate electrically connected to the floating diffusion region. The imager may also include at least one transistor for transferring charge from the photosensitive element to the floating diffusion region, and a transistor for resetting the floating diffusion region to a predetermined charge level prior to charge transfer. A row select access transistor is also typically used to gate a pixel output signal produced by the source follower transistor. 
The pixel cell above is often called a CMOS Active Pixel Sensor (APS) cell, which is used to collect light energy and convert it into a readable electrical signal.\nA schematic top view of a portion of a semiconductor wafer fragment containing one exemplary CMOS APS cell is shown in FIG. 1. This CMOS APS cell 10 is a four transistor cell. As it will be described below, the CMOS APS cell 10 shown includes a photodiode 13 formed within a substrate. This photodiode 13 is formed as a pinned photodiode shown in FIG. 2. Alternatively, the CMOS APS cell 10 may include a photogate, photoconductor or other photon to charge converting device, in lieu of"} +{"output_text": " to Bezier curves, offset curves have not been widely used in the computer graphics community. This is primarily due to the fact that the computational complexity of offset curves is generally higher than that of Bezier curves. In addition, the computational complexity of offset curves is generally higher than that of Bezier curves for the same number of control points.\nThe computational complexity of offset curves is generally higher than that of Bezier curves for the same number of control points. This is due to the fact that the", "input_text": "/shaded models in a raster graphics display, wherein the inputs to a transformation processor are the parameters for a rational Bezier surface.\nOffset curves have also received considerable attention, primarily in the CAD community (see generally, Farin, G., Curvature Continuity and Offsets for Piecewise Conics, TOG 8, 2 (April 1989), pp. 89-99; Farouki, R., and Neff, C., Algebraic Properties of Plane Offset Curves, CAGD, p. 297-299; Farouki, R., and Neff, C., Analytic Properties of Plane Offset Curves, CAGD, pp 297-299; Hoschek, J., Spline Approximation of Offset Curves, CAGD 5 (1988), pp. 33-40; Klass, R., An Offset Spline Approximation for Plane Curves, CAD 15, 5 (September 1983), pp. 297-299). Various definitions of offset curves exist. 
The predominant definition, however, is one based on a centerline or \"generator\" curve with an offset a distance w defined along a unit normal to each point of the centerline curve. In the absence of a sign convention for the unit normal, an offset curve is defined on both sides of the centerline curve. Generally, the term \"left offset curve\" designates the offset curve based upon a unit normal given by a positive sign, while the term \"right offset curve\" designates the offset curve based upon a unit normal by the negative sign. Further, the term \"offset curve\" is used to refer to one or the other of the left and right offset curves, while the term \"offset curve pair\" indicates both the left and right offset curves.\nWhile the increased generality of offset curves makes them a more powerful means of expressing curves and surfaces as compared"} +{"output_text": "In the image pickup apparatus shown in FIG. 20, the vibration signal forming circuit 211 detects the vibration of the body of the image pickup apparatus and forms a vibration signal. The vibration signal is supplied to the vibration correcting circuit 212. The vibration correcting circuit 212 removes a low-frequency component from the vibration signal and outputs a vibration signal having a predetermined frequency. The vibration signal is supplied to the adder 218. The adder 218 adds the vibration signal to the feedback signal from the position detecting sensor 207", "input_text": "theta. is corrected by an optical processing, so that the object image is formed on the image sensor 205 as a light flux having no shaking.\nFIG. 20 is a block diagram showing the arrangement of a conventional image pickup apparatus which corrects an image shake by means of the image pickup optical system 200.\nIn the image pickup optical system shown in FIG. 20, when a power supply switch 208 is turned on, a mode microcomputer 209 notifies a main microcomputer 210 of the turning-on of the power supply switch 208. 
Then, having determined that the power supply has been turned on, the main microcomputer 210 starts its control operation.\nSubsequently, a vibration signal forming circuit 211, which has detected the vibration of the body of the image pickup apparatus, forms a vibration signal and supplies the vibration signal to a vibration correcting circuit 212. In the vibration correcting circuit 212, the analog vibration signal is converted into a digital vibration signal by an A/D converter 213, and, then, a predetermined low-frequency component is removed from the digital vibration signal by a high-pass filter (HPF) 214. After that, the phase and gain of an output signal of the HPF 214 are corrected by a phase/gain correcting circuit and an output signal of the phase/gain correcting circuit 215 is integrated by an integration circuit 216 to calculate and output a correction target value.\nThe correction target value outputted from the vibration correcting circuit 212 is converted into an analog value by a D/A converter 217 and is then supplied to an adder 218. At the adder 218, the analog correction target value is added to a feedback signal supplied from the position detecting sensor 207 through an amplifier 219. Then, an output signal of the adder 218 is supplied to a driving circuit 220. The driving circuit 220 issues a driving signal to the actuator 206 to drive the shift lens 203.\n"} +{"output_text": " focal conic (F) orientation and homeotropic (H) orientation. 
The planar orientation is a state in which the molecules are oriented in parallel to the substrate surface, the focal conic orientation is a state in which the molecules are oriented in parallel to the substrate surface but are inclined with respect to the substrate surface, and the homeotropic orientation is a state in which the molecules are oriented perpendicular to the substrate surface.\nThe planar orientation is a state in which the molecules are oriented in parallel to", "input_text": " glass (FSG) approximately 10 to 30 xcexcm thick. After idle (e.g., 20 minutes without processing), the chamber is plasma cleaned and a thin FSG layer of approximately 1 to 3 xcexcm is deposited. The plasma clean and deposition of the thin FSG layer is repeated periodically, e.g. every 12 hours, to maintain the chamber in an equilibrium state.\nThe present invention will be more fully understood when taken in light of the following detailed description taken together with the accompanying drawings. 1. Field of the Invention\nThe present invention relates to a polymer/cholesteric liquid crystal dispersion which is utilized for display elements, image/information recording elements and spatial light modulators, to a method of producing the dispersion and to a liquid crystal display element utilizing the dispersion.\n2. Description of the Related Art\nA cholesteric liquid crystal display element has, for example, the characteristics that it has a memory storing ability which can retain a display without any power source, it has the ability to obtain a bright display because no polarizing plate is used, and it enables color displaying without using a color filter. Attention has been therefore focused on such display elements in recent years (see, for example, Japanese Patent Application Laid-Open (JP-A) No. 
05-080303).\nA cholesteric liquid crystal, in particular, is made of rod-shaped molecules oriented spirally and reflects interference light having a wavelength which corresponds to the spiral pitch (called selective reflection). It therefore has the characteristics that bright color display is possible without using any color filter by designing the spiral pitch to have a length corresponding to the wavelength of a red color, a green color or a blue color.\nCholesteric liquid crystal sealed into a cell constituted by a pair of substrates provided with electrodes is known to take any of three types of oriented states: planar (P) orientation,"} +{"output_text": " is not so high. On the other hand, in the latter, namely, in the internally threaded member made of a rod, as viewed in a cross-section perpendiacular to the axis of the shank, the reinforcing effect of the fiber filaments is high, but the reinforcing effect of the first strands is low.\nIn the case of the externally threaded member, the reinforcing effect of the fiber filaments is low, and the reinforcing effect of the first strands is high, and, therefore", "input_text": "ohardening resin in a solvent therefor, heating the above-obtained resin impregnated cloth to remove the solvent, thereby, to obtain a half-cured, substantially non-tacky sheet (hereinafter referred to as \"prepreg\"), rolling the thus obtained prepreg, heating and pressing the rolled prepreg in a mold to prepare a rigid rod having a circular cross-section, and threading the surface of the rigid rod. On the other hand, when an externally threaded FRP member with a diameter less than about 20 to 25 mm is intended, the threaded member may be produced as follows. According to the so-called prepreg-press method, a predetermined number of the above-mentioned prepregs 1 are piled up as shown in FIG. 
1(A) which will be mentioned later, the piled-up prepregs are softened by heating, and the piled-up prepregs are subjected to molding by means of a split die 2 as shown in FIG. 1(B), which will be mentioned later, to prepare a rod 3. The rod thus prepared is subjected, on its peripheral surface, to threading to obtain an FRP made threaded member. As is apparent from the above, both the above-mentioned threaded members are each composed of a thermohardened resin and a yarn cloth comprising a plurality of first strands of fiber filaments arranged substantially in parallel and extending longitudinally of the axis of the shank and a plurality of second strands of fiber filaments arranged substantially in parallel and substantially in perpendicular relation to said first strands. However, in the former, namely, in the externally thread member made of a rolled prepreg, as viewed in a cross-section perpendiacular to the axis of the shank, only one first strand extend from the center of the shank into the thread in a rolled manner and, therefore, the reinforcing effect of the fiber filaments"} +{"output_text": " M. K. Kandere-Grzybowska, M. K. Kandere-Grzybowska, M. K. Kandere-Grzybowska, M. K. Kandere-Grzybowska, M. K. Kandere-Grzybowska, M. K. Kandere-Grzybowska, M. K. Kandere-Grzybowska, M. K. Kandere-Gr", "input_text": " B. Geiger and J. P. Spatz, Biophysical journal, 2007, 92, 2964-2974).\nEndothelial cells (EC) spread poorly on flat, rigid implant surfaces such as those of commonly used stents for the treatment of the effects of coronary artery disease (S. Garg and P. W. Serruys, J Am Coll Cardiol, 2010, 56, S1-42; M. Hristov, A. Zernecke, E. A. Liehn and C. Weber, Thromb Haemost, 2007, 98, 274-277). 
Topographic modifications of the surface with micron and sub-micron scale structures may accelerate onset of spreading as well as subsequent topography-guided cell polarization (contact guidance), which are requirements for the re-establishment of a differentiated endothelium. Indeed, surface texturing of known biocompatible materials represents a promising strategy to modulate cellular processes which are essential for the development or regeneration of functional tissues (F. Guilak, D. M. Cohen, B. T. Estes, J. M. Gimble, W. Liedtke and C. S. Chen, Cell Stem Cell, 2009, 5, 17-26; F. Variola, F. Vetrone, L. Richert, P. Jedrzejowski, J. H. Yi, S. Zalzal, S. Clair, A. Sarkissian, D. F. Perepichka, J. D. Wuest, F. Rosei and A. Nanci, Small, 2009, 5, 996-1006). Studies have shown that surface modifications profoundly influence almost all tested cellular activities from cell polarization and migration to gene expression profile, differentiation, and apoptosis (K. Kandere-Grzybowska,"} +{"output_text": " the resolution of a digital image is defined by the number of pixels in a given area of the image. For example, a digital image with a resolution of 300 pixels per inch (ppi) has a resolution of 300 pixels per inch. The resolution of a digital image is usually expressed in terms of dots per inch (dpi). For example, a digital image with a resolution of 300 dpi has a resolution of 300 dots per inch.\nIn the past, when a user wanted to obtain", "input_text": "28225 need to perform the calculation of the edge sharpness and coarseness as well as the selection from the three filters. This complicates the operations done by the unit as a whole and hence, the unit has a complicated configuration. 
In addition, the MTF correction unit selectively switches between the smoothing filter and the edge emphasizing filter based on the reference values and therefore, density variations may occur in a portion consisting of pixels with the values of edge sharpness and coarseness close to the reference values. 1. Field of Invention\nThe invention relates generally to digital image processing systems. More particularly, methods and apparatus for selectively processing a digital image are disclosed. More particularly, the invention provides techniques that provides a recipient of a low-resolution composited digital image (such as a greeting card or calendar) a way to automatically obtain a high-resolution rendering of the received composited digital image.\n2. Description of Relevant Art\nWhen a user performs image operations using digital image processing programs such as Adobe Photoshop(trademark) or PhotoDeluxe(trademark), image operations are performed directly on the raw pixels of the image. Since most imaging applications only perform image operations on one resolution, usually the highest resolution, these operations are sometimes very slow, even on the fastest computers. If an application could work at a lower resolution for display purposes, the processing time would significantly decrease, thus increasing the productivity of the user. While it is sometimes possible for an application to work on lower-resolution image data, when the image with all the applied image operations is to be saved, the full-resolution image data must be processed at that time. If this step were not performed, the saved image would only contain low-resolution image data. While this is one option, this is not desired since it would not be possible to obtain an image rasterized at a higher resolution.\nIn general,"} +{"output_text": "-color screen, switched or otherwise, the intermediate subassembly 50 and the anode subassembly 62.\nFIG. 
6 is a section along a plane perpendicular to the columns, at two different points on the screen (for example at the top and at the bottom).\nIn this figure, the intermediate subassembly 50 bears the general reference 60 and is depicted schematically with regions 62 corresponding to the central part of the pixels, relatively transparent areas, and half-regions 64 corresponding to the lateral half-", "input_text": " parts 42a, 42b, 42c, 42d.\nThis complex periodic structure of the intermediate subassembly, superimposed on the structure, also periodic, of the anode subassembly (in terms of emission as previously defined) can lead to display defects due to moire effects. These defects are illustrated in FIGS. 5a and 5b, on the one hand, and in FIG. 6 on the other hand.\nFIGS. 5A and 5B, first of all, correspond to the case where the intermediate subassembly and the anode subassembly are not strictly aligned. This appears when there are several bands of luminescent materials (three-color screen, switched or otherwise). It is assumed that the columns of luminescent materials disposed on the anode are not strictly parallel to the addressing columns of the intermediate subassembly. FIGS. 5A and 5B are sections along a plane perpendicular to the columns, at two different points on the screen (for example at the top and at the bottom).\nIn these two figures, the intermediate subassembly bears the general reference 50 and is depicted schematically with regions 52 corresponding to the central part of the pixels, relatively transparent areas, and half-regions 54 corresponding to the lateral half-areas, less transparent; the anode subassembly is depicted in the form of the emissive luminous band 62.\nFIGS. 
5A and 5B depict, by way of example, the case of a three-color screen, switched or otherwise.\nSince the light is not transmitted from the luminescent bands to the observer in the same way from one end of the column to the other, the image perceived will be interfered with by lines or fringes which are more or less bright and coloured. This moire effect is a nuisance to the observer.\nFIG. 6 also shows, in the case of both a monochrome screen and a three"} +{"output_text": " which is a process of adding a small amount of a plating solution to a large amount of water, with the intent of removing the contaminants from the plating solution. The dummying process is usually performed in a separate tank, and the treated solution is then returned to the plating tank.\nThe plating solutions are usually prepared by dissolving the metal salts in water, with the pH adjusted to the desired level. The metal salts may be added to the water in the form of a soluble salt, such as", "input_text": " important that the substrate surface be absolutely clean and receptive to the plating, generally requiring that substrates being electroplated be prepared prior to electroplating. In the effort to get the substrate into this condition, several separate steps may be required, such as soak cleaning, followed by electrocleaning, followed by rinsing.\nFormulations of plating baths may be flexible in some systems and very sensitive to variations in others, many of the more recent changes resulting from waste treatment and safety requirements. 
Besides the ability to deposit a coating having acceptable appearance and physical properties, the desired properties of the plating bath include: high metal solubility, good electrical conductivity, good current efficiencies for anode and cathode, noncorrosivity to substrates, nonfuming, stable, low hazard, low anode dissolution during down-time, good throwing power, good covering power, wide current density plating range, ease of waste treatment, and economical to use. Few formulas have all these attributes, with only a few plating solutions being used commercially without special additives, to brighten, reduce pitting, and/or otherwise modify the character of the deposit or performance of the solution to meet some of the criteria above, with the suppliers of the proprietary additives normally specifying the preferred formulations to be used.\nPurification, often needed once a plating bath is prepared, is used periodically to maintain the plating solutions Alkaline zinc plating solutions are sensitive to a few mg/L of heavy-metal contamination, which may be precipitated using sodium sulfide and subsequently filtered out. Nickel plating solutions may contain excess iron, which may be removed by a process involving peroxide oxidation, precipitation at a pH of about 5, and filtration of the iron. The more complex, less water-soluble organic contaminants, along with some trace metals, are removed with activated carbon treatments in separate treatment tanks. A common purification treatment used both on new and used plating solutions is dummying,"} +{"output_text": " to install. For example, wood grilles are typically installed by nailing or screwing the grille into the return air opening. This requires the installer to drill holes in the return air opening and then to install the wood grille. The holes must be drilled in the return air opening at the exact location of the standard-sized filter opening. 
This requires the installer to measure the location of the standard-sized filter opening and then to drill the holes at the exact location of the standard-sized", "input_text": " housing designed to hold both the grille and a standard-sized filter and be screwed or nailed into a return air duct opening. The second part is either (i) a hinged grille attached to the framed housing or (ii) a removable grille with hardware that \u201ccatches\u201d movable hardware (such as rotatable latches, pins, and screws), holes or \u201cdimples\u201d in the framed housing.\nCurrently, the most common return air grille assemblies are designed using only one material, both the framed housing and the hinged or removable grille. Materials most commonly used for such assemblies are metal, such as aluminum or steel, where the assemblies are generally manufactured by \u201cstamping\u201d relatively thin metal, often as thin as 0.030.\u2033 Further, HVAC systems are typically designed with sizes contemplating the use of standard-sized return air openings, standard-sized removable air filters, and correspondingly sized steel or aluminum return air grilles. For example, one industry standard-sized opening has a nominal size of 20\u2033\u00d725\u2033. The actual opening in the structure is about \u215b\u2033 to \u00bd\u2033 larger across the face of the opening, such as side to side, to allow the housing to fit therein. The filter size also has a nominal size also of 20\u2033\u00d725\u2033 with an actual size of about \u00bc\u2033 to about \u00bd\u2033 less across the face of the filter to fit inside the relatively thin housing. The next smaller standard-sized opening for both dimensions is 18\u2033\u00d720\u2033 and the next larger standard-sized opening is 20\u2033\u00d730\u2033.\nIn less common instances, wood or wood-like material is used, which includes both (i) the visible grille and (ii) the framed housing that affixes at the return air opening. 
However, current wood grille designs pose installation problems and often require additional costs"} +{"output_text": "ization of the feature sizes is limited by the resolution of the optical system.\nIn order to attain the resolving power of 8 k\u00d74 k pixels, the GG projector 48 utilizes two G devices (G1 device and G2 device) each having 4 k\u00d72 k pixels. In common with the G1 and G2 devices, respective pixels are arranged at intervals of pitch Px in the horizontal direction and pitch Py in the vertical direction. As shown in FIG. 4, respective pixels forming", "input_text": " modulated by the G1G2 image is projected by a GG projector 48, forming an image on the screen 49.\nIn order to attain the resolving power of 8 k\u00d74 k pixels, the G1G2 projector 48 utilizes two G devices (G1 device and G2 device) each having 4 k\u00d72 k pixels. In common with the G1 and G2 devices, respective pixels are arranged at intervals of pitch Px in the horizontal direction and pitch Py in the vertical direction. As shown in FIG. 4, respective pixels forming the whole G1 device are shifted from respective pixels forming the whole G2 device by Px/2 in the horizontal direction and by Py/2 in the vertical direction. That is, while inputting signals meeting with the resolving power of G1G2 to the GG projector 28, respective images from the G1 and G2 devices are overlaid on each other at a slant of 45 degrees by half pixel, whereby the resolving power equivalent to 8 k pixels is attained. On the other hand, the RB projector 47 utilizes an R device and a B device each having 4 k\u00d72 k pixels.\nThat is, it is difficult structurally to fabricate an optical system where 2 channels of green images are provided by a single projector. Therefore, the proposed projector of FIG. 3 adopts the shown constitution composed of the GG projector 48 for G1, G2 and the RB projector 47 as a result of eliminating a G-component from the RGB projector. 
In this projector, by projecting images from two projectors 47, 48 in stack and further combining respective images with each other onto the screen 49, a high resolving power (high-definition) can be attained. The development of semiconductor memory technology is essentially driven by the requirement for increasing the performance of the semiconductor memories in conjunction with miniaturization of the feature sizes. However, further miniatur"} +{"output_text": " the semiconductor industry has been developing a variety of packaging techniques to meet the requirements of the submicron era.\nIn the packaging process, a semiconductor chip is usually mounted on a substrate, and then the substrate is encapsulated with a molding compound. The molding compound is usually formed by a transfer molding process. The molding compound is usually formed by injecting a molding compound into a mold cavity. The molding compound is then cured to form a molding compound layer. The molding compound layer is then transferred to a substrate", "input_text": " to the existing GGSN node itself The client can still operate normally, and does not have to know about the proxy server, therefore supporting the handover with no or minimum changes to the existing client itself. If the DNS server for a selected APN does not point to the proxy server, then conventional operation will occur, and the rest of the network is not impacted. The handover mechanisms enabled by the proxy server and found in some embodiments of the inventions preserve the address of the client with minimal messaging overhead. Consistent connections may therefore be maintained and optimized even when data is routed through varying networks. A centralized proxy server can optionally maintain records for billing and usage purposes for all varying services (especially in the case where the proxy server is handling both data and control). A centralized proxy server can optimize traffic flow over a wide range of networks. 
A centralized proxy server can maintain a unique identifier while a client travels through different ISP's This application claims the priority benefit of Taiwan application serial no. 89125978, filed Dec. 6, 2000.\n1. Field of the Invention\nThe present invention relates generally to packaging techniques, and more particularly, to a multimedia chip package and a fabrication method thereof.\n2. Description of the Related Art\nBecause of the rapid development of information technology, integrated circuits have become essential for daily use. Products formed by integrated circuits appear in all aspects of daily life. Semiconductor technology is continuously developing to be more personal and functional, and complex new electronic products are constantly being developed. As a result, a major trend in the semiconductor industry is to make products even lighter, thinner, smaller and faster, while also making them friendly, convenient, powerful, reliable and increasingly less expensive.\nIn semiconductor fabrication, semiconductor devices are entering the submicron century by producing 0.18 \u03bcm integrated circuits. As a matter of fact,"} +{"output_text": " be able to access the network and communicate with the telephony services without regard to the underlying network platform.\nThe JTAPI specification defines a set of \u201ccore\u201d APIs that are common to all JTAPI providers. The core APIs are: JTAPI_Initialize( ) JTAPI_Terminate( ) JTAPI_GetStatus( ) JTAPI_GetError( ) JTAPI_GetErrorString( ) JTAPI", "input_text": "3, June, 1999). All of these documents are incorporated herein by reference. JTAPI uses a \u201ccore plus extensions\u201d structure, in which the \u201cCore JTAPI\u201d package includes the basic call object model used in placing, answering and terminating telephone calls, while the extension packages add features required by more advanced applications.\nFIG. 
1 is a block diagram that shows the basic elements of the JTAPI environment, as they are known in the art. JTAPI enables application vendors to write an application 20 that will provide value-added telephony services to the user, independent of the type of network 22 and communication protocol stack 24 that are used to carry the services. Examples of Java telephony applications (as listed on the above-mentioned JTAPI Web site) include: Call logging and tracking; Auto-dialing; Screen-based telephone applications; Screen-pop software; Call routing applications; Automated attendants; Interactive voice response (IVR) systems; Agent software; Call center management; Fax send and receive; Voice mail. These applications are listed by way of example, and by no means represent an exhaustive list of such applications.\nIn order to enable such telephony application services, the network provider must implement JTAPI provider software 28 that exposes an application programming interface (API) 26 complying with JTAPI specifications. The same API is exposed regardless of the underlying network platform: for example, network 22 may be a circuit-switched network, such as a public switched telephone network (PSTN) with an SS7 protocol stack or a private branch exchange (PBX) using proprietary protocols, or it may be a packet-switched network, such as an Internet Protocol (IP) network using an H.323 stack to carry voice over IP (VoIP). 
Because API 26 is uniform among all network types, application 20 should"} +{"output_text": " Escherichia coli, plasmid originated from yeast, plasmid originated from insect, plasmid originated from animal, plasmid originated from plant, plasmid originated from virus, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage, plasmid originated from bacteriophage,", "input_text": "-1, which contains cDNA coding total amino acid sequence of human Serrate-1 of the present invention, is transformed into E. coli JM109, has been deposited in the National Institute of Bioscience and Human-Technology, Agency of industrial Science and Technology, MITI, of 1-1-3, Higasi, Tsukuba-shi, Ibaragi-ken, Japan, as E. coli: JM109-pUCSR-1. Date of deposit was October 28, 1996, and deposition No. is FBRPM BP-5726.\nExprssion and purification of various forms of human Delta-1 and human Serrate-1 using cDNA coding amino acid sequence of human Delta-1 and human Serrate-1 isolated by the above methods are known in the references (Kriegler, Gene Transfer and Expression- A Laboratory Manual Stockton Press, 1990 and Yokota et al. Biomanual Series 4, Gene transfer and expression and analysis, Yodosha Co., 1994). 
A cDNA coding the amino acid sequence of the isolated said human Delta-1 and human Serrate-1 is ligated to preferable expression vector and is produced in the host cells of eukaryotic cells such as animal cells and insect cells or prokaryotic cells such as bacteria.\nIn the expression of human Delta-1 and human Serrate-1 of the present invention, DNA coding polypeptide of the present invention may have the translation initiation codon in 5\u2032-terminal and translation termination codon in 3\u2032-terminal. These translation initiation codon and translation termination codon can be added by using preferable synthetic DNA adapter. Further for expression of the said DNA, promoter is linked in the upstream of the DNA sequence. Examples of vector are plasmid originated from Bacillus, plasmid originated from"} +{"output_text": " in the range 0.1 MPa to 10 MPa. The hourly space velocity is generally over 100 h\u22121 and usually in the range 100 h\u22121 to 1000 h\u22121. The hydrogen recycle ratio is generally over 1 and usually in the range 1 to 10.\nThe catalyst of the present invention can also be used for hydrocracking vacuum distillate type cuts which are highly charged with nitrogen.", "input_text": " in the range 350\u00b0 C. to 580\u00b0 C. (i.e., corresponding to compounds containing at least 15 to 20 carbon atoms). They generally contain heteroatoms such as sulphur and nitrogen. The nitrogen content is usually in the range 1 to 5000 ppm by weight and the sulphur content is in the range 0.01% to 5% by weight.\nThe catalyst of the present invention can advantageously be used for hydrocracking vacuum distillate type cuts which are highly charged with sulphur and nitrogen.\nThe catalysts of the present invention preferably undergo sulphurisation to transform at least part of the metallic species to the sulphide before bringing them into contact with the feed to be treated. 
This activation treatment by sulphurisation is well known to the skilled person and can be carried out using any method already described in the literature, i.e., either in the reactor or ex-situ.\nOne conventional sulphurisation method which is well known to the skilled person consists of heating in the presence of hydrogen sulphide (pure or, for example, in a stream of a hydrogen/hydrogen sulphide mixture or a nitrogen/hydrogen sulphide mixture) to a temperature in the range 150\u00b0 C. to 800\u00b0 C., preferably in the range 250\u00b0 C. to 600\u00b0 C., generally in a traversed bed reaction zone.\nThe hydrocracking conditions such as temperature, pressure, hydrogen recycle ratio, and hourly space velocity, can vary widely depending on the nature of the feed, the quality of the desired products and the facilities available to the refiner. The temperature is generally over 200\u00b0 C. and usually in the range 250\u00b0 C. to 480\u00b0 C. The pressure is over 0.1 MPa and usually"} +{"output_text": "\n(ii) binding a carbon nanotube to each of said binding sites.\nThe walls are formed by anodically oxidizing aluminum. 
The conductive surface is preferably a layer containing at least one element selected from the group consisting of titanium, zirconium, niobium, tantalum, molybdenum, copper and zinc.\nThe invention has an object to provide an electron emission device giving a large quantity of electron emission and having a high performance.\nSpecifically,", "input_text": " carbon nanotube device in which the carbon nanotube binds to a conductive surface so that conduction is maintained therebetween, and the carbon nanotube has a high directivity.\nFurther, the invention has an object to provide an electron emission device giving a large quantity of electron emission and having a high performance.\nSpecifically, there is provided a carbon nanotube device comprising a support having a conductive surface and a carbon nanotube, one of whose terminus binds to said conductive surface at a site so that conduction between said conductive surface and said carbon nanotube is maintained, wherein a root of said carbon nanotube where said carbon nanotube binds to said conductive surface is surrounded by a wall.\nForming the barrier with a layer containing alumina or silicon is preferable with a view to achieving a higher density of the carbon nanotubes binding to the conductive surface. The wall containing alumina is available, after forming an aluminum thin film on the conductive surface, for example, by anodically oxidizing aluminum. At this point, the conductive surface should preferably comprises a layer containing at least one element selected from the group consisting of titanium, zirconium, niobium, tantalum, molybdenum, copper and zinc. 
It is not necessary that the conductive surface be previously protected even during anodic oxidation of the aluminum thin film.\nThere is also provided, a manufacturing method of a carbon nanotube device comprising a support having a conductive surface and a carbon nanotube, one of whose terminus binds to said conductive surface at a site so that conduction between said conductive surface and said carbon nanotube is maintained, wherein a root of said carbon nanotube at the site where said carbon nanotube binds to said conductive surface is surrounded by a wall, said method comprising the steps of:\n(i) forming a plurality of carbon nanotube binding sites isolated from each other by walls on said conductive surface; and"} +{"output_text": " is the most common solution to the problem.\nIn the second solution, U.S. Pat. No. 5,721,917 (K. S. Kim et al.) discloses a process and apparatus for protecting a backside of a wafer from a deposition process by use of a protective layer of silicon nitride. However, the silicon nitride layer is deposited on the backside of the wafer by use of a chemical vapor deposition (CVD) process. The CVD process is not a", "input_text": " (W) and produces additional HF. The HF then reacts with the native oxide which causes additional polysilicon to be exposed to tungsten flouride (WF.sub.6).\nSome of these partially coated backside materials become detached in subsequent processes and form particulates which can cause fatal defects in the evolving semiconductor devices. Also excessive uneven buildup of adhering deposited material on the backside of the wafer can deplanarize the backside, rendering the backside ineffective as a planar datum to assure accurate processing of the frontside, such as maintaining a consistent depth of focus during a photolithographic exposure operation. 
With respect to the current invention the problem is that a thin film of tungsten is deposited regionally on a backside layer of polysilicon, which prevents subsequent stripping of the polysilicon and results in a nonplanar wafer backside.\nTo avoid these kinds of problems, firstly, the unprotected backside of the wafer could be subsequently stripped of its undesirable deposited materials, resulting in an additional manufacturing step which adds to the cost and exposes the wafer to additional yield detracting handling. Or, secondly, the wafer backside could be protected from the deposition processes.\nIn one solution of the first kind, U.S. Pat. No. 5,384,008 (Ashok Sinha et al.) discloses a process and apparatus which utilizes a necessary subsequent etching process step to remedy a backside deposition problem by use of a reduced size wafer pedestal. However, the deposited material removal is limited to a generally annular area at the periphery of the backside of the wafer. It does not address the problem of deposits near the center of the backside of the wafer which is the problem necessitating the present invention. Furthermore, it does not address the general problem of being an added costly manufacturing step for depositions not requiring an immediate subsequent etching step. Yet, this"} +{"output_text": " method, a magnetic pole of a rotor magnet is increased, and a magnetic pole of a stator magnet is decreased. In the second method, a magnetic pole of a rotor magnet is increased, and a magnetic pole of a stator magnet is increased.\nIn the first method, a magnetic pole of a rotor magnet is increased, and a magnetic pole of a stator magnet is decreased. 
In the second method, a magnetic pole of a rotor magnet is increased, and a magnetic pole of a stator magnet is", "input_text": " invention, a method of aligning the optical and electrical scan in a scrolling color projector includes providing a reference phase signal indicative of timing information for presentation of an optical scan and an electrical scan to a light valve, providing an optical scan and an electrical scan on the light valve in accordance with the reference phase signal, determining a difference in phase between the optical scan and the electrical scan on the light valve, storing the difference in phase between the optical scan and the electrical scan, and providing the optical scan to the light valve based on a combination of the difference in phase and the reference phase signal and providing the electrical scan to the light valve based on the reference phase signal.\nA preferred form of the method and apparatus for alignment of the optical and the electrical scan in a scrolling color projector as well other embodiments, objects, features and advantages of the present invention will become readily apparent from the following detailed description of illustrative embodiments, which is to be read in connection with the accompanying drawings. 1. Field of the Invention\nThe present invention relates to a rotor structure of a claw pole type of actuator of a single phase structure, and more specifically, it relates to a rotor magnet structure and a magnetization pattern of an actuator that is inexpensive, is easy to assemble, and has stable characteristics of repetitive rotational operation.\n2. 
Description of the Prior Arts\nIn an actuator in which an electric rotating machine of a single phase structure with a claw pole type of structure is provided with a stop mechanism so as to enable a rotor to be repetitively rotatably operated by excitation of a coil, it is desired to increase a dynamic torque while ensuring a detent torque and to increase an angle range of rotational operation.\nConventionally, the following two methods have been proposed on the side of a rotor magnet to increase a dynamic torque while ensuring a detent torque and to increase an angle range of rotational operation.\nIn the first"} +{"output_text": " resist profile. The focus exposure matrix is typically generated by a computer program that is designed to optimize the focus and exposure settings for a given resist material. The focus exposure matrix is typically generated by a computer program that is designed to optimize the focus and exposure settings for a given resist material. The focus exposure matrix is typically generated by a computer program that is designed to optimize the focus and exposure settings for a given resist material. The focus exposure matrix is typically generated by a computer program that is designed to optimize", "input_text": "e., a material sensitive to irradiation. The exposed photoresist typically forms a pattern that after development masks the layers of the wafer during subsequent processing steps, as for example deposition and/or etching.\nTwo of the most important process parameters for controlling the photolithographic process are focus and exposure. Focus generally deals with clarity with which an optical subsystem of the lithography system renders an image and exposure generally deals with the amount or dosage of light (or radiation) that is used to form the pattern (such as the light produced by a light source of the lithography system). Both affect the circuit pattern in a non-trivial way. 
For example, changes in focus and exposure may cause changes in the resist profile, i.e., the shape of the circuit printed in the photoresist. The resist profile is often described by three parameters related to a trapezoidal approximation of the profile: the line width or critical dimension (CD), the sidewall angle and the height. If the resist profile changes are too great, then the final circuit may not run properly or it may not run at all. By way of example, line width is one factor that determines the speed and the timing across the circuit and thus changes thereto may cause one portion of the circuit to run faster or slower than another portion of the circuit (thereby reducing the selling price of the chip since the circuit is clocked to the slower portion). As should be appreciated, the quality of the resist profile is directly related to the quality of the etched or deposited features formed there through. In addition, changes to the resist profile may cause open or shorted circuits such that the circuit may need to be discarded or reworked.\nPresently, the optimal focus and exposure settings of the lithography system are determined using a focus exposure matrix (FEM), i.e., by exposing a wafer with multiple combinations of focus and exposure, and then inspecting the resultant"} +{"output_text": " spare wheel is not provided with a tire. The temporary spare wheel is generally mounted on the vehicle by means of a mounting bracket. The mounting bracket is generally mounted on the vehicle by means of a mounting bolt. The mounting bolt is generally screwed into a mounting hole of the vehicle. The temporary spare wheel is generally mounted on the vehicle by means of a mounting bracket. The mounting bracket is generally mounted on the vehicle by means of a mounting bolt. 
The mounting bolt is generally screwed into a mounting hole of", "input_text": " cell characteristic\u2014from a cell whose radial orientation is from 10 to 45 degrees away from full radial orientation.\nOf course, other goals and advantages of the inventive technology are revealed in the disclosure provided herein, whether explicitly or implicitly. The present invention relates to a method and a device for adjusting the braking and/or driving effect at the wheels of a motor vehicle.\nA plurality of systems for anti-lock braking control, for traction control, and/or for vehicle stability in motor vehicles are known from the related art. These systems generally start from at least the wheel rotational speeds or the wheel speeds of the vehicle wheels. However, before the wheel speeds are used for regulation, they are generally corrected by a so-called tire tolerance adjustment. Such a regulating system for a motor vehicle is described in German Published Patent Application No. 42 30 295, for example, in which errors in the wheel speeds created by tolerances between the tires are equalized. Such tolerances are due to different wheel diameters, for example. For example, a low-pass filtering in conjunction with a tire tolerance adjustment is described in German Published Patent Application No. 42 30 295.\nFurthermore, a slip regulation system is known, for example, from European Published Patent Application No. 0 510 466, where the wheel rotational speeds are used for slip formation. To equalize the tire tolerances, the wheel speeds are corrected. When determining the appropriate correction factors, possibly existing cornering of the vehicle must be taken into consideration.\nThe variations for adjusting tire tolerances known from the related art generally require relatively long time intervals. 
If the braking regulation and/or driving regulation is acted upon shortly after the vehicle is started, a slow tire tolerance adjustment can potentially result in unfavorable conditions.\nIf there is tire damage on a motor vehicle, often only a temporary spare wheel or spare wheel is provided. In contrast to standard wheels, this temporary"} +{"output_text": "5 to 10% by weight of Co and 0.5 to 10% by weight of Fe, to a heat treatment at a temperature of 1,000.degree. C. or higher for 1 to 10 hours.\nHowever, the heat treatment at a temperature of 1,000.degree. C. or higher for 1 to 10 hours is not suitable for a TAB tool, which is used in a high temperature environment. In addition, the heat treatment at a temperature of 1,000.", "input_text": ", as compared with that of a small-sized shape.\nFurthermore, another problem than the strength relates to a heat response when this material is used for a pulse heating tool. As apparent from the structure shown in FIG. 3, a pulsating instantaneous heat generation in a shank is propagated through the ceramic substrate and reaches the surface of polycrystalline diamond. Accordingly, the heat response of the tool, determining a mounting cycle, largely depends on the thermal conductivity of the substrate. In the case of the substrate disclosed in Japanese Patent Laid-Open Publication No. 224549/1990, in fact, if its material does not have highly thermal conductivity, the thermal conductivity of the tool is not sufficient and the mounting cycle is thus lengthened by at least two times as long as tools of metals or alloys, as shown in FIG. 2. This is a problem.\nTherefore, it is considered most suitable to use a high strength and high thermal conductivity material as a substrate for an ideal TAB tool, which is coated with polycrystalline diamond by a gaseous phase synthesis method. 
From this point of view, cemented carbides are considered suitable as a substrate for a TAB tool.\nThe coating technique of polycrystalline diamond onto a cemented carbide substrate has actively been developed for the purpose of mainly aiming at applying to cutting tools and as to the bonding strength of a diamond film having hitherto been considered to be a problem, various improving methods have been proposed. In particular, a surface modifying method comprising subjecting cemented carbides to a heat treatment under special conditions, as disclosed in Japanese Patent Laid-Open Publication No. 330959/1993, is effective for improving the bonding strength. According to this method, the bonding strength between a cemented carbide substrate and diamond coating layer is improved by subjecting a WC-based cemented carbide having a composition comprising, as a binder phase component, 0."} +{"output_text": "ol tris(alkylsulfonate), and pyrogallol tris(arylsulfonate).\nThe resist composition according to the present invention may further comprise a basic compound. 
Examples of the basic compound include organic bases such as triethanolamine, N-methyl-2-pyrrolidone, N,N-dimethyl-2-pyrrolidone, pyridine, N,N-dimethylaminopyridine, N-methylpiperidine, N-methylmorpholine", "input_text": " substituted or unsubstituted alicyclic hydrocarbon having from 6 to 20 carbon atoms; (k+l)/(k+l+p+q+r) is in the range of 0.01-0.5; p/(k+l+p+q+r) is in the range of 0.1-0.6; q/(k+l+p+q+r) is in the range of 0.1-0.6; and r/(k+l+p+q+r) is in the range of 0.1-0.3.\nIn another embodiment, the present invention provides a resist composition comprising a photosensitive polymer including at least one of the monomers having the respective formulae: \nwhere R7, R8, R9, R10, R11, R12, R13, R14, and R15 are independently a hydrogen atom or alkyl, and z is an integer from 1 to 6.\nIn the resist compositions according to the present invention, the photosensitive polymer has a weight average molecular weight of 3,000 to 100,000.\nThe amount of the photoacid generator (PAG) is in the range of 1 to 30% by weight based on the weight of the photosensitive polymer. It is preferable that the photoacid generator (PAG) comprises triarylsulfonium salts, diaryliodonium salts, sulfonates or a mixture of these materials. More preferably, the photoacid generator (PAG) comprises triphenylsulfonium triflate, triphenylsulfonium antimonate, diphenylionium triflate, diphenyliodonium antimonate, methoxydiphenyliodonium triflate, di-t-butyidiphenyliodonium triflate, 2,6-dinitrobenzyl sulfonates, pyrogall"} +{"output_text": " to raise its temperature to a rubbery state.\nThe present invention overcomes the disadvantages of the prior art by providing an optical storage medium which can be erased by a single laser beam at a wavelength which is absorbed by the expansion layer. The expansion layer is formed of a material which is transparent to the erase beam and which has a high thermal conductivity. 
The expansion layer is formed of a material which is transparent to the write beam and which has a low thermal conductivity. The expansion layer is formed of", "input_text": " layer rises in temperature above its glass transition temperature so that it can deform to accommodate the bump. The beam is then turned off and the retention layer cools quickly to its glassy state before the bump levels out, thereby fixing the bump. Reading or playback of the data is then achieved by a low intensity \"read\" beam which is focused on the partially reflecting interface between the retention layer and air. When the read beam encounters the bump, some of the reflected light is scattered, while other portions of the reflected light destructively interfere with reflected light from non-bump areas. The resulting drop in intensity is registered by the detector. Removal of the bump to erase the data is achieved by a second laser beam at an \"erase\" wavelength which is absorbed by the retention layer and not the expansion layer. This beam heats the retention layer along to a rubbery state where its viscoelastic forces and those of the expansion layer return it to its original flat configuration. The write, read and erase beams all enter the medium on the retention layer side, passing through retention layer before reaching the expansion layer.\nThe erasable optical storage medium system described in Feyrer et. al., has a number of disadvantages. For example, the writing and erasure of data must be performed at two different wavelengths of light.\nFurther, the device relies on reflection at the interface between the retention layer and air which results in an inherently low reflectivity (30% maximum). Thus the system cannot be read by the detection mechanism of a standard compact disk player designed for focusing through a 1.2 mm polycarbonate substrate and requiring 70% reflectance. 
Still further there is either a predetermined level of thermal conductivity between the heated expansion layer, to sufficiently raise the temperature of the retention layer so that it can accommodate the bump formed by the expansion layer, or the retention layer must absorb a predetermined amount of light energy"} +{"output_text": " consumption of the data processing means.\nA further known measure is to include additional current profile generators which overlay an additional stochastic current profile on the original current profile of the circuit, for example of the controller, etc. The benefit of this approach is doubtful, since in general it does not provide adequate protection against DPA or HO-DPA and additionally may lead to a significant increase in the current consumption of the data processing means.\nA further known measure is to include additional current profile generators which overlay", "input_text": " from the matrix to decrease as a function of time. These uncertainties restrict the application of the device, and present the rate of release from being known during its use. A similar device is set forth by Levesque in U.S. Pat. No. 2,987,445. 1. Field of the Invention\nThe present invention relates to data storage devices, and in particular to secure operation of such data storage devices.\n2. Description of Related Art\nModern attacks on data processing means processing data relevant in terms of security, and/or attacks on the algorithm and secret keys processed therein are effected via so-called leak information. Leak information includes, for example, current consumption of the data processing unit, electromagnetic radiation during operation of the data processing means, etc. 
Conclusions about information relevant in terms of security may be drawn from a statistical analysis of the physical signals picked up.\nHere, the most common forms of attack are simple power analysis (SPA), differential power analysis (DPA) or high-order differential power analysis (HO-DPA).\nVarious methods have been employed to prevent these attacks, such as methods related to software engineering which comprise continuously altering the sequence of operations of the cryptographic algorithm or inserting redundant operations. Hereby, statistical evaluations, for example of the power profile or of the electromagnetic radiation, at least prevented or at least made much more difficult.\nA disadvantage of this approach is the large-scale intervention in the respective software and the algorithms as well as the significant reduction in performance which results in most cases. A further known measure is to include additional current profile generators which overlay an additional stochastic current profile on the original current profile of the circuit, for example of the controller, etc. The benefit of this approach is doubtful, since in general it does not provide adequate protection against DPA or HO-DPA and additionally may lead to a significant increase in the current"} +{"output_text": " charge density, which can be between 0.1 and 0.5 meq/g, preferably between 0.2 and 0.4 meq/g.\nThe cationic starches which can be used in accordance with the invention can be obtained by any known technique, in particular by the cationic starch synthesis technique described in the document EP-A-0,788,904.\nThe cationic starches which can be used in accordance with the invention can be obtained by any known technique", "input_text": "cBxe2x80x9d. maize, waxy maize particularly). 
The starchy materials which can be used in accordance with the invention can also consist of flours or other mixes containing starch(es) and vegetable protein(s), the starch(es) component pre-dominating, and all the products resulting from the chemical and/or physical modification, in one or more stages, of the said flours and the said mixes.\nWhen, in the context of the invention, a chemically modified product is used as the starchy material intended for fluidification, the latter is selected particularly from the group including starches and flours modified by one, at least, of the known techniques of etherification, esterification, sulphonation, oxydation or plastification, in particular cationisation, hydroxyalkylation or acetylation.\nAs a result of which, in accordance with one variant, the starchy materials conversion process in accordance with the invention is characterised in that the starchy material subjected to the chemical fluidification stage is selected from among native starches and flours and from products resulting from etherification, esterification, sulphonation, oxydation and/or plastification, and in particular cationisation, hydroxyalkylation or acetylation, of the said starches and the said flours.\nThe applicant Company has particularly noted that the starchy material subjected to fluidification could to advantage be constituted by a cationic starch.\nSuch products can be prepared by any known technique, conducted in a hydrous medium, in a solvent medium or in dry phase, suitable for enabling one or more electropositive nitrogenous group(s) to settle on the starch. 
The said nitrogenous groups can particularly contain at lest one tertiary or quaternary nitrogen atom.\nThe cationic starches which can be used in accordance with the invention have a non-restrictive fixed"} +{"output_text": " same as the surface temperature of the heating rotary member in the sheet passing area.\nConventional Example 2: the supply of heat to the heating rotary member is stopped between sheets, and the surface temperature of the heating rotary member in the non-sheet passing area is made lower than the surface temperature of the heating rotary member in the sheet passing area.\nConventional Example 3: the supply of heat to the heating rotary member is stopped between sheets, and the surface temperature of the heating rotary member in", "input_text": ", there is the possibility of faulty fixing occurring, and on the other hand, if the temperature is too high, there is the possibility of the heating rotary member or a member proximate thereto receiving thermal damage. Further, if the temperature of a non-sheet passing portion has become too high as compared with the temperature of a sheet passing portion, the temperature of the end portion of the sheet passing portion becomes too high as compared with a proper fixing temperature, and this gives rise to the fear that hot offset should occur.\nIn recent years, there is a demand for an image forming apparatus which copes with various paper sizes from paper of a relatively large size such as, for example, A3 size to paper of small sizes such as A4R and B5 size usually used. Therefore, it is necessary to construct the axial lengths of the heating rotary member and a pressure rotary member so as to correspond to a relatively large size such as, for example, A3 size. 
However, in a case where the construction as previously described is adopted, when paper of a small size such as A4R or B5 passes through a fixing apparatus, a non-sheet passing area through which the paper does not pass increases in the effective fixing area of the heating rotary member. When copying is continuously effected on the paper of a small size, heat is not taken away from the surface of the heating rotary member corresponding to the non-sheet passing area by the paper and therefore, the surface temperature of the non-sheet passing area becomes very high.\nIn order to solve the above-noted temperature rise of the non-sheet passing portion, the following propositions have been made.\nConventional Example 1: the supply of heat to the heating rotary member is stopped between sheets, and idle rotation or the like is effected so that the surface temperature of the heating rotary member in the non-sheet passing area may become the"} +{"output_text": "\nA user must also be aware of the location of the mouse pointer on the display screen. The mouse pointer is typically located in the upper left hand corner of the display screen. The mouse pointer is typically located in the upper left hand corner of the display screen. The mouse pointer is typically located in the upper left hand corner of the display screen. The mouse pointer is typically located in the upper left hand corner of the display screen. The mouse pointer is typically located in the upper left hand corner of", "input_text": " controlling or ensuring a specific access routine would be highly desirable. Accessing, and tracking the access of all desired bookmark or hotlist locations is an inefficient process. Management of a daily list of URLs currently must be done manually.\nCurrently, bookmark or hotlist features require the user to click on the menu item entitled \"bookmark\" or \"hotlist\" to display pull down menus containing folders or URLs. 
To select a bookmark location, the user must traverse the pull down menu with the mouse button depressed and select a menu item in the pull down menu, such as a folder. Next, the folder must be selected and opened, and finally a URL address or bookmark must be selected. Minimal user input would be desirable to efficiently select frequently utilized locations and to provide a user friendly interface.\nCurrently, bookmark or hotlist utilization in browser programs requires opening files and performing multiple steps, such as selecting through a series of menu or sub-menu items to activate a bookmark. With known graphical user interfaces, each time a folder within a sub-menu is selected, which is listed under a menu heading, user precision is required to highlight the menu heading, traverse the newly displayed sub-menu items while keeping the mouse button depressed, and then releasing the mouse button or double clicking the mouse button on the desired selection. A computer operator is required to perform abrupt changes in the motion of the mouse in coordination with a mouse button to select a concealed menu item that resides within a folder. During menu item selection, a user cannot be clumsy or inattentive, because a menu item selection might be made which was not desired.\nA sub-menu item is typically less than quarter of an inch in height on a typical display or monitor. Therefore, substantial dexterity is required to traverse menus and select desired menu items utilizing a pointing device, further coordinated with mouse button activation."} +{"output_text": " stress response can lead to heart rhythm abnormalities, stroke, and heart attack.\nThe health consequences can be serious or even life threatening in those with moderate OSA. The body's response to hypoxia (lack of oxygen) is to increase the amount of blood pumped by the heart to deliver more oxygen to the brain and other organs. 
This can lead to heart rhythm abnormalities, stroke, and heart attack.\nThe health consequences can be serious or even life threatening in those with mild OSA. The", "input_text": "ension) or traction (sliding) pressure can be used to treat a wide range of conditions, including, for example, cardiovascular, respiratory, muscular, skeletal, lymphatic and skin conditions. In some embodiments, the invention can be adapted for sleep breathing therapy, and more specifically, to provide apparatuses and methods for treating sleep disordered breathing, such as snoring, and obstructive sleep apnea, with hydraulic suction or traction used to open the upper airway.\nThe present invention also describes a novel traction method for externally opening the upper airway by manipulating the anatomic relationships between the skin and soft tissues of the neck and upper chest, and the hyoid bone located in the throat and directly connected to the tongue.\nObstructive sleep apnea (OSA) is a common but under-diagnosed breathing disorder affecting 1 out of 5 adults in the United States. It is caused by airway collapse and blockage when the tongue falls back in the throat and the airway walls compress the upper air passage during sleep. This causes snoring and chronic sleep deprivation that manifests as daytime fatigue, reduced mental ability, driving and work related accidents, and lower overall productivity. The silent health consequences in those with moderate or severe OSA can be serious (high blood pressure, diabetes, acid reflux, blood clots) or even life threatening (miscarriage, stroke, heart attack, or sudden death from heart rhythm abnormalities). Habitual snoring occurs in 44% of men and 28% of women aged 30 to 60 in the United States, and is responsible for significant sleep disruption for bed partners of loud snorers.\nThe health consequences can be serious or even life threatening in those with severe OSA. 
Low blood and tissue oxygen levels caused by cessation of respiration trigger the release of stress hormones like cortisol and adrenaline. These chemicals cause harmful surges in blood pressure, heart rate and blood sugar. Repetitive cycles of this"} +{"output_text": ", the mark is not straight.\nWhen a card is to be embossed, the card is placed in a card holding member such as a carriage, and the embossing is executed while the card is held in the card holding member. In this case, the card is held in the card holding member by a vacuum suction mechanism. However, when the card is held in the card holding member by the vacuum suction mechanism, the card is held in the card holding member by a suction force", "input_text": ", etc., and a topper process for applying colors to the characters.\nIn the marking process, concaving processes (engraving) or convexing processes (embossing) of a specified shape are executed on the card surface by applying pressure from the front or back of the card at specified positions using a known card marking device. To transport the card to the marking position of the marking device, the card is placed in a card holding member such as a carriage. Embossing or other processes are subsequently executed as shown in FIG. 33(A).\nAnother type of conventional card issuing device has a plurality of card stackers for handling a plurality of cards, and is equipped with a card delivery mechanism for each card stacker. In such a device, the plurality of card delivery mechanisms are not always used simultaneously and therefore constitute duplicate structures. This leads to complex structures, large device sizes, and increased manufacturing cost.\nOn the other hand, in a conventional card issuing device capable of handling a large volume of the same type of cards by moving a plurality of card stackers arranged in a series, a large dead space is created, increasing the size of the device. 
In addition, even though a plurality of card stackers are provided, the same type of cards are stocked in the plurality of card stackers, and the device cannot handle different types of cards.\nWhen marking surface irregularities of a specified shape on a card, marking is done while the card is inside a card holding member. This marking must to be done parallel to an edge of the card. However, when marking while moving the card, it is difficult to maintain a perfect degree of parallelism between the edge of the card and the moving direction. When marking is done while not maintaining a perfect degree of parallelism, slanted marking results as indicated in FIG. 33(B), and when the mark length is long"} +{"output_text": "xe2x80x2 ends of a polynucleotide species. Examples of suitable enzymes include but are not limited to: exonuclease I, exonuclease III, exonuclease VII, exonuclease VIII, exonuclease X, exonuclease II, exonuclease IV, exonuclease V, exonuclease VI, exonuclease VII, exonuclease IX, exonuclease IIC, exonuclease IID, exonuclease IIC, exonucle", "input_text": "vinylpyrrolidone, etc.).\nIn one aspect, the improved shuffling method includes the addition of at least one additive which is a cationic detergent. Examples of suitable cationic detergents include but are not limited to: cetyltrimethylammonium bromide (CTAB), dodecyltrimethylammonium bromide (DTAB), and tetramethylammonium chloride (TMAC), and the like.\nIn one aspect, the improved shuffling method includes the addition of at least one additive which is a recombinogenic protein that catalyzes or non-catalytically enhances homologous pairing and/or strand exchange in vitro. Examples of suitable recombinogenic proteins include but are not limited to: E. coli recA protein, the T4 uvsX protein, the rec1 protein from Ustilago maydis, other recA family recombinases from other species, single strand binding protein (SSB), ribonucleoprotein A1, and the like. 
Shuffling can be used to improve one or more properties of a recombinogenic protein; for example, mutant sequences encoding recA can be shuffled and improved heat-stable variants selected by recursive sequence recombination.\nNon-specific (general recombination) recombinases such as Topoisomerase I, Topoisomerase II (Tse et al. (1980) J. Biol. Chem. 255: 5560; Trask et al. (1984) EMBO J. 3: 671, incorporated herein by reference) and the Like can be used to catalyze in vitro recombination reactions to shuffle a plurality of related sequence polynucelotide species by the recursive methods of the invention.\nIn one aspect, the improved shuffling method includes the addition of at least one additive which is an enzyme having an exonuclease activity which is active at removing non-templated nucleotides introduced at 3"} +{"output_text": " 0.5 to about 5.0 atmospheres, more preferably from about 0.5 to about 3.0 atmospheres, and most preferably from about 0.5 to about 1.5 atmospheres.\nThe process of the present invention can be carried out in a variety of ways. For example, the process can be carried out in a single reaction zone or in a series of reaction zones. In a single reaction zone, the sulfur-containing fluid is contacted with the sorbent composition in", "input_text": " of sulfates associated with the sorbent composition. It has been discovered that the presence of the promoter metal in the sorbent composition facilitates a reduction in the amount of sulfates associated with the sorbent composition when the sulfated sorbent composition is contacted with the reducing stream in the activation zone. Thus, the amount of sulfates removed from a sorbent in the activation zone when the sorbent comprises the promoter metal is more than the amount of sulfates removed from the sorbent composition when the sorbent comprises substantially no promoter metal. 
Preferably, when the sorbent comprises the promoter metal, at least about a 2 percent increase in sulfate removal (by weight of sulfur as sulfates) is exhibited over a sorbent comprising substantially no promoter metal, more preferably at least about a 5 percent increase in sulfate removal is exhibited, still more preferably at least about a 10 percent increase in sulfate removal is exhibited, and most preferably at least a 50 percent increase in sulfate removal is exhibited over a sorbent comprising substantially no promoter metal.\nOnce the sorbent has been activated in the activation zone, at least a portion of the activated sorbent can be returned to the desulfurization zone for desulfurization or further desulfurization of the sulfur-containing fluid.\nIn carrying out the process of the present invention, a stripper zone can optionally be inserted before and/or after, preferably before, regenerating the sulfurized sorbent composition in the regeneration zone. A similar stripper zone, preferably utilizing a stripping agent, serves to remove a portion, preferably all, of any hydrocarbon(s) from the sulfurized sorbent composition. Such stripper zone can also serve to remove oxygen and sulfur dioxide from the system prior to introduction of the regenerated sorbent composition into the activation zone. Preferably, the stripping, when employed, is carried out at a total pressure in a range of from about"} +{"output_text": " or in combination.\nThe amount of the plasticizer used is preferably about 0.1 to about 10 parts by weight per 100 parts by weight of the adhesive resin. 
When the amount of the plasticizer is less than about 0.1 part by weight, the plasticizer may not be sufficiently dispersed in the adhesive resin, and when the amount of the plasticizer is more than about 10 parts by weight, the adhesive resin may be deteriorated in the heat resistance.\nThe present invention is described in", "input_text": " oxide, and about 0.5 or less part by-weight per 100 parts by weight of the above-mentioned adhesive resin.\nWhen the addition amount of carboxyl modified silicone oil or amine modified silicone oil is less than about 0.5 part by weight per 100 parts by weight of tin-doped indium oxide and/or antimony-doped tin oxide, enough dispersing effect may not be obtained. On the other hand, carboxyl modified silicone oil or amine modified silicone oil is added over about 0.5 part by weight per 100 parts by weight of adhesive resin, the bond strenght between the obtained interlayer film and the glasses may vary with a lapse of time.\nAs the dispersant of the present invention, it can be used by combining the above-mentioned (a)a chelating agent, (b)a compound having one or more carboxyl groups at its terminal position, or (c)a modified silicone oil, together with any other dispersants. 
As the other dispersants, there can be mentioned the dispersants generally used as dispersants of inorganic fine particles, for example, phosphate compounds such as phosphate or polyphospate, and so on, sulfate compounds such as organic sulfate, and so on, polyalcohols surfactants such as polycarboxylate, polyol ester, and so on.\nIn the present invention, one of the preferable embodiments is plasticizing an adhesive resin by a plasticizer.\nThe plasticizer used in the present invention is not specifically limited, and any per se known plasticizer generally used for an interlayer film can be used, but preferably used are, for example, organic plasticizers such as monobasic acid ester, polybasic acid ester, and so on, phosphoric acid plasticizers such as organic phosphoric acid, organic phosphorous acid, and so on.\nThese plasticizers can be used singly"} +{"output_text": "phase flow measurement for each of the individual wells.\nIn accordance with another embodiment of the present invention, a method of allocating fluid production from a plurality of individual hydrocarbon wells is provided. The method comprises the steps of: (a) simultaneously measuring a plurality of individual well multi-phase flow measurements, each indicative of the fluid produced by a respective one of the individual wells; (b) measuring a combined multi-phase flow measurement of the combined fluid cooperatively produced by all of the individual wells", "input_text": " minimum of two well tests per month on each well, and additional well tests within 24 hours of any significant operational change of the well is required. Further, a full time surveillance engineer is required to validate the well testing and production allocation process. 
Using this method for custody transfer requires a significant capital expenditure to standardize all well testing equipment.\nResponsive to these and other problems, an object of the present invention is to provide a system for measuring and monitoring individual well production without requiring time-consuming, periodic, manual, individual well tests.\nA further object of the present invention is to provide a well production measuring and monitoring system which continuously monitors the production of each individual well.\nA still further object of the present invention is to provide an individual well production measuring and monitoring system which allows for the elimination of costly test headers.\nAn even further object of the present invention is to provide a system for more accurately measuring the production of a plurality of individual wells operating as part of a common production field.\nIt should be noted that the above-listed objects need not all be accomplished by the invention claimed herein, and other objects and advantages of this invention will be apparent from the following description of the invention and appended claims.\nIn accordance with one embodiment of the present invention, a method of determining fluid production from a plurality of individual hydrocarbon wells is provided. 
The method comprises the steps of: (a) simultaneously measuring a plurality of individual well multi-phase flow measurements, each indicative of the fluid produced by a respective one of the individual wells; (b) measuring a combined multi-phase flow measurement of the combined fluid cooperatively produced by all of the individual wells; and (c) allocating the combined multi-phase flow measurement from step (b) to each of the individual wells based on the individual well multi-phase flow measurements from step (a) to thereby determine an adjusted individual well multi-"} +{"output_text": "ptides.\nThe invention also provides the use of polynucleotide shuffling to shuffle a population of protein variants, such as taxonomically-related, structurally-related, and/or functionally-related enzymes and/or mutated variants thereof to create and identify advantageous novel polypeptides.\nThe invention also provides the use of polynucleotide shuffling to shuffle a population of protein variants, such as taxonomically-related, structurally-related, and/or functionally-related enzymes and/or", "input_text": " polymerase under conditions which provide for the annealing of the single-stranded fragments at the areas of identity and the formation of a chimeric double-stranded polynucleotide sequence comprising template polynucleotide sequences; and repeating the above steps as desired.\nA fourth aspect of the present invention is directed to a method of replicating a template polynucleotide by combining in vitro single-stranded template polynucleotides with small random single-stranded fragments resulting from the cleavage and denaturation of the template polynucleotide, and incubating said mixture of nucleic acid fragments in the presence of a nucleic acid polymerase under conditions wherein a population of double-stranded template polynucleotides is formed.\nThe invention also provides the use of polynucleotide shuffling, in vitro and/or in vivo to shuffle polynucleotides 
encoding polypeptides and/or polynucleotides comprising transcriptional regulatory sequences.\nThe invention also provides the use of polynucleotide shuffling to shuffle a population of viral genes (e.g., capsid proteins, spike glycoproteins, polymerases, proteases, etc.) or viral genomes (e.g., paramyxoviridae, orthomyxoviridae, herpesviruses, retroviruses, reoviruses, rhinoviruses, etc.). In an embodiment, the invention provides a method for shuffling sequences encoding all or portions of immunogenic viral proteins to generate novel combinations of epitopes as well as novel epitopes created by recombination; such shuffled viral proteins may comprise epitopes or combinations of epitopes which are likely to arise in the natural environment as a consequence of viral evolution (e.g., such as recombination of influenza virus strains).\nThe invention also provides the use of polynucleotide shuffling to shuffle a population of protein variants, such as taxonomically-related, structurally-related, and/or functionally-related enzymes and/or mutated variants thereof to create and identify advantageous novel polype"} +{"output_text": " contacts on the device under test. This is because the support element 54 is mounted on the upper side of the probe card 52 so that the contacts on the central region 80 of the flexible membrane assembly are presented in a suitable position for pressing engagement with the pads of the device under test. The support element is mounted on the upper side of the probe card by means of a plurality of alignment pins 74 that are inserted through holes in the probe card and into holes in the support element. The alignment pins are", "input_text": " a rearward base portion 64 to which the attachment arms 60 are integrally joined. 
Also included on the support element 54 is a forward support or plunger 66 that projects outwardly from the flat base portion This forward support has angled sides 68 that converge toward a flat support surface 70 so as to give the forward support the shape of a truncated pyramid. Referring also to FIG. 2, a flexible membrane assembly 72 is attached to the support after being aligned by means of alignment pins 74 included on the base portion. This flexible membrane assembly is formed by one or more plies of insulative sheeting such as KAPTON(trademark) sold by E.I. Du Pont de Nemours or other polyimide film, and flexible conductive layers or strips are provided between or on these plies to form the data/signal lines 76.\nWhen the support element 54 is mounted on the upper side of the probe card 52 as shown in FIG. 3, the forward support 66 protrudes through a central opening 78 in the probe card so as to present the contacts which are arranged on a central region 80 of the flexible membrane assembly in suitable position for pressing engagement with the pads of the device under test. Referring to FIG. 2, the membrane assembly includes radially extending arm segments 82 that are separated by inwardly curving edges 84 that give the assembly the shape of a formee cross, and these segments extend in an inclined manner along the angled sides 68 thereby clearing any upright components surrounding the pads. 
A series of contact pads 86 terminate the data/signal lines 76 so that when the support element is mounted, these pads electrically engage corresponding termination pads provided on the upper side of the probe card so that the data/signal lines 48 on the probe card are electrically connected to the contacts on the central region.\nA feature of the probing assembly 42 is its capability for probing a somewhat dense arrangement of"} +{"output_text": " of the television program.\nThe device according to the invention may be used in a television receiver, a video recorder, a video cassette recorder, a video disk recorder, a video tape recorder, a video camera, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video camera recorder, a video", "input_text": " a device having a more user-friendly facility of rapidly skipping the commercials in a television program.\nTo this end, the device according to the invention is characterized in that the buffer means are adapted to receive television images at a first frame frequency and to supply television images at a second frame frequency, a quantity X being defined as being X equal to the number of consecutive television images, stored in the buffer means, between an image read from the buffer means at a given instant and an image written at substantially the same instant, in that X increases with time in the first state and decreases with time in the second state, and in that the device comprises control means for generating a first control signal for bringing the device from the first to the second state, and a second control signal for bringing the device from the second to the first state.\nThe invention provides a device in which it is no longer necessary to store a number of minutes of the television program corresponding 
to the period of time of the commercials to be expected in the television program into the buffer means and subsequently start watching the television program. The invention provides the possibility of watching these commercials simultaneously with the reception of the beginning of the television program. To be able to skip the expected commercials rapidly, a slightly larger number of television images are stored in the buffer means per unit of time in a first state as compared with the number of television images read, so that at the instant when a block of commercials in a program is displayed and watched by the user, this block is approximately entirely stored in the buffer means. The user can now switch the device to a second state so that the block of commercials is skipped in an accelerated way. In order that the user can continue watching the program, he will then switch the device to the first state again.\nThe device according to the invention has the extra advantage that the buffer means may be smaller because it need only be approximately the size"} +{"output_text": " significant lesions.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a high-density interconnect structure.\nIn recent years, the integration density of semiconductor devices has been increased, and the number of interconnections has been increased. In order to increase the integration density of semiconductor devices, it is necessary to reduce the size of the interconnections. However, the reduction of the size of the interconnections causes a problem that the", "input_text": " and necrotic foam cells, as well as OxLDL, secrete factors that weaken the plaque. Human pathology studies have shown that atheromas containing a large necrotic core, thin fibrous cap and large numbers of macrophage/foam cells in the shoulder are more predisposed to plaque rupture and thrombosis. 
These lesions, which frequently appear as mild or moderate coronary stenoses in angiographic studies, are characterized pathologically as large atheroma with extensive lipid pools exceeding 40% of plaque areas. Angiography only provides a measure of arteial lumen, but fails to detect vessel wall pathology. Diagnostic methods that provide a measure of the overall extent of the atherosclerotic lesion, with an emphasis on OxLDL and lipid content, would therefore be desirable. Moreover, the lipid core of atheromas can be assumed to contain extensive oxidized lipids that accumulated within foam cells and set free when cells undergo necrosis and apoptosis.\nNon-invasive detection of atherosclerotic lesions is currently not clinically feasable. The gold standard for diagnosing atherosclerosis is angiography which detects abnormal vessel lumen contours caused by encroaching atherosclerosis but does not directly identify abnormalities of the vessel wall. The widely recognized limitations of angiography include poor correlation with functional stenosis, interobserver and intraobserver variability, underestimation of the extent of disease because of diffusely atherosclerotic vessels, and arterial remodeling. B mode and ultravascular ultrasonography can detect intima/media thickening and calcification of vascular walls, but cannot clearly assess specific tissue characteristics. Electron beam computed tomography detects only calcium in vessel walls. Magnetic resonance imaging is still an investigational tool for the detection of plaque components.\nHuman studies have suggested that plaque rupture frequently occurs in nonangiographically significant lesions that contain abundant lipid-laden macrophages and large lipid pools within atheromas. 
Therefore imaging of atherosclerosis directed at lipid rich areas would be of value, not only in detecting the extent of lesion burden, but also in the detecting clinically"} +{"output_text": " electronically generated or stored originals, where a charged surface may be imagewise discharged in a variety of ways. Ion projection devices where a charge is imagewise deposited on a charge retentive substrate operate similarly.\nThe foregoing broadly describes a typical electrophotographic machine. In such machines, the photoreceptor is usually charged to a uniform potential and then selectively discharged (or charged) in accordance with a light image of the original to be reproduced. Electrophotographic machines may use a variety of techniques to", "input_text": " comparing the power supply voltages on the power supply connections with an associated threshold voltage and ignoring externally provided commands while the power supply voltages are below a predetermined threshold voltage.\nAnother method of operating a memory having a plurality of power supply connections is described. The method comprises monitoring power supply voltages on power supply connections, comparing the power supply voltages on the power supply connections with an associated threshold voltage and prohibiting initialization of memory while power supply voltages are below predetermined threshold voltages. 1. Field of the Invention\nThis invention relates to methods and apparatus for predicting the cycle-down characteristics of the photoreceptor. More particularly, through the use of historical values and actual measured values, the characteristics of the decaying charge potential on the photoreceptor can be predicted for the next cycle.\n2. Description of the Related Art\nIn electrophotographic applications such as xerography, a charge retentive surface is electrostatically charged. A photoreceptor belt has a typical charge retentive surface. 
A light pattern formed from the original image to be reproduced selectively discharges the charge on the photoreceptor. The resulting pattern, a combination of charged and discharged areas on the photoreceptor, form an electrostatic charge pattern (an electrostatic latent image) conforming to the original image. The latent image is developed by contacting it with a finely divided electrostatically attractable powder referred to as \"toner\". Toner is held on the image area by the electrostatic charge on the surface. Thus, a toner image is produced in conformity with a light image of the original being produced. The toner image may then be transferred to a substrate (e.g., paper), and the toner is fused onto the substrate by passing through a fuser. At this point the image is affixed to the substrate and is ejected from the machine to the holding tray. The process is well known, and is useful for light lens copying from an original, and printing applications from"} +{"output_text": " for the system.\nThe present invention is directed to a drainage system for use in a patient's pleural cavity. The system includes a drainage chamber having a first end and a second end. The first end is adapted to be connected to a drainage tube. The second end is adapted to be connected to a suction tube. The system also includes a first valve disposed between the first end and the second end of the drainage chamber. The first valve is adapted to be opened and closed by a first valve actuator", "input_text": " also cause damage to the blood components.\nOther prior art drainage systems are described in my co-pending application Ser. No. 801,205, filed Nov. 25, 1985, which is incorporated herein by reference. The drainage systems, described therein, which obviate the above problems by removing the liquid (water) seal from the drainage chamber to the other, e.g., suction control, chambers suffer from the disadvantage of being cumbersome. 
Furthermore, their use is position dependent in that any changes in height of the liquid, e.g., as a result of tilting in the suction and/or water seal chambers of, e.g., the Deknatel(.TM.) Pleur-evac(.TM.) drainage systems (Deknatel division of Howmedica, Inc., Floral Park, N.Y.) will effect changes in the suction applied to the patient. In addition, changes in water level as a result of evaporation or entrainment in the evacuated gases will also affect accurate control of pressures within the system. To overcome those problems constant monitoring of, or periodic additions of water to maintain the liquid levels of the water seal containing chambers is required with their concomitant potential for operator errors.\nThe drainage system described in said co-pending application Ser. No. 801,205 avoids those problems by removing all liquid seals and uses purely mechanical means, such as flapper valves on the drainage inlet tube and excess positive or negative pressure relief valves. Therefore, although liquids may be present in the chamber after drainage has been initiated the applied suction cannot be affected by changes in the liquid level therein and the patient is protected from pnemothorax upon attachment to the device. Furthermore, said invention also avoids the cumbersome aspects of the prior art devices by providing a single easily detachable drainage collection chamber to which is removably and sealably affixed a cap comprising all of the controls"} +{"output_text": "+impurity region 13 acting as the gate.\nThe horizontal charge transfer section 2 includes a horizontal charge transfer channel 15 formed on the main surface of the p-type silicon substrate 6. The horizontal charge transfer channel 15 is formed between the photoelectric conversion regions 4 and the vertical charge transfer regions 5. 
The horizontal charge transfer channel 15 is formed by a plurality of n-type impurity regions 16 formed on the main surface of the p-type silicon substrate 6.\nThe horizontal charge transfer section 2", "input_text": " having the illustrated construction is disclosed in Japanese Patent Laying-Open No. 59-105779, for example.\nReferring to these drawings, the interline transfer type solid-state image sensing device comprises a photosensitive, vertical charge transfer section 1, a horizontal charge transfer section 2, and overflow drain sections 3. The photosensitive, vertical transfer section 1 includes photoelectric conversion regions 4 and vertical charge transfer regions 5. The photoelectric conversion regions 4 include a plurality of n-type impurity regions 7 arranged in a matrix form on a main surface of a p-type silicon substrate 6.\nThe vertical charge transfer regions 5 include channel regions 8, insulating films 9 and transfer electrodes 10. The channel regions 8 comprise n-type impurity regions formed on the main surface of the p-type silicon substrate 6. The transfer electrodes 10 comprise a plurality of polysilicon conductive layers aligned in a direction of charge transfer.\nA p+ impurity region 11 is formed between each n-type impurity region 7 of the photoelectric conversion regions 4 and each channel region 8 of the vertical charge transfer regions 5. The transfer electrodes 10 partly overlie the p+impurity regions 11 with the insulating films 9 in between. The p+impurity regions 11, extensions of the transfer electrodes 10 and insulating layers 9 constitute read gates 12.\nThe overflow drain sections 3 are opposed to the vertical charge transfer regions 5 across the photoelectric conversion regions 4. The overflow drain sections 3 include p+impurity regions 13 and n-type impurity regions 14 formed on the main surface of the p-type silicon substrate 6, and gate electrodes 14 formed on the insulating films 9. 
Each overflow drain section 3 has a construction of a MOS (metal oxide semiconductor) transistor having the n-type impurity region 7 of the photoelectric conversion region 4 acting as the source, the n-type impurity region 14 acting as the drain, and the p"} +{"output_text": " be cleaned and inspected in about 30 minutes.\nIn-line inspection and measurement systems are typically used to detect and characterize contamination and residues on wafers. These systems typically include a light source, such as a laser, that illuminates the wafer surface and a detector that detects the light reflected from the wafer surface. The reflected light is analyzed to determine the presence of contamination and residues on the wafer surface.\nIn-line inspection and measurement systems are typically used to detect and characterize contamination and residues on wafers", "input_text": " such as but not limited to the semiconductor industry. Each step is time consuming and requires expensive chemicals that may require special disposal procedures. Existing methods for monitoring or controlling these processes are expensive and time consuming. As a result, wafers are often cleaned for a longer period of time, using more chemicals than are required.\nContinuing with the example of the semiconductor wafer industry, contamination and chemical residue are two major causes of reduced yield in semiconductor fabrication. Wafer cleanliness is becoming even more important as minimum feature sizes shrink below 90 nm, where the thickness of adsorbed layers and organic contaminants are on the same order as the process tolerances of functional films in devices. Contamination, whether organic or metallic, may generate process variation and defects such as poor coverage, vacancies, voids, leaking, shorting, and overburdens. 
For example, a small amount of metal contamination on the wafer surface may diffuse into the bulk semiconductor and cause the bulk minority carrier lifetime to decrease because the metal contamination may assist in the recombination of electrons and holes in the semiconductor substrate. Reducing contamination and minimizing residues are important factors in increasing yields in semiconductor wafer fabs.\nDefect detection and characterization systems, such as in the semiconductor wafer industry, can usually be divided into in-line and off-line systems. \u201cIn-line\u201d refers to inspection and measurement that takes place inside the clean room where wafers are processed. \u201cOff-line\u201d refers to analysis that takes place outside of the wafer processing clean room, often in a laboratory or separate clean room that is located some distance from the manufacturing area. In addition, many of these analytical techniques are destructive, which requires either the sacrifice of a production wafer or the use of expensive \u201cmonitor\u201d wafers for analysis. In-line inspection and measurement is crucial for rapidly identifying and correcting problems that may occur periodically in these types of manufacturing processes. A typical semiconductor wafer can"} +{"output_text": " production is related to increased acetylcholine release from cholinergic nerves.\nThe role of neuropeptides in sebum production is also unclear. Neuropeptides are known to be present in the skin and to have a variety of effects on the skin. For example, neuropeptides are known to be present in the skin of the eyelids and to have a variety of effects on the skin.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for", "input_text": " odor of sweat.\nHolocrine glands are fundamentally different from apocrine and eccrine glands. The secretion is primarily lipid rather than water. 
Moreover the lipid secretion is not secreted from cells; instead the cell, called acebocyte, accumulates large amounts of the secretion and then dies, releasing the lipid material together with cellular remnants.\nThe vast majority of holocrine glands are sebaceous glands that produce a lipid secretion called sebum. Sebaceous glands usually have several acini that open into a short duct. The sebum producing cells are present in the acini and in the wall of the duct. Most sebaceous glands are called pilosebaceous glands because they secrete into a duct that normally opens into the upper part of a hair follicle. However, in certain areas of the body such as the lips the ducts open directly onto the skin's surface. A variety of other holocrine organs are present in the skin of the eyelids: meibomium glands, and glands of Zeiss and Moll.\nThe control of holocrine glands has long been known to involve systemic hormones, particularly the male sex hormones called androgens. Androgens increase during puberty in both males and females. Supporting the connection between hormones and sebaceous gland function is that sebum production increases after puberty and its peak incidence is from ages 12 to 22. Increased sebum production is also related to pregnancy, pre-menstrual period and to birth control medication.\nThe role of classical neurotransmitters such in sebum production is unclear. Anticholinergics appear to have little effect on sebum production. However, pilocarpine, a cholinergic agonist, increases sebum production when iontophoresed across the skin (Yosipovitch et al, Br J Dermatology, 1995: 561-4). Evidence suggests that increased sebum"} +{"output_text": " unselected memory cell P2.\nIn the above-described conventional semiconductor memory device, the select gate 47 is formed of a polycrystalline silicon film, and the select gate 47 is connected to the control gate 45 through a contact hole formed in the first insulating film 42. 
The select gate 47 is formed of a polycrystalline silicon film having a high resistance, and the select gate 47 is connected to the control gate 45 through a contact hole formed in the first insulating film 42.\nIn the above", "input_text": " gate 45, drain lines 48 and the source lines 49 are arranged in alternating columns, each being parallel with the control gates (CG) as seen in FIG. 17.\nIn this semiconductor memory device, each semiconductor memory element has a floating-gate electrode 44 formed on first insulating thin film (a gate oxidation film) 42 on a semiconductor substrate 41, and a line-shaped control gate electrode 45 covering the floating-gate electrode 44 through a second insulating film 50 (an ONO laminated film). A line-shaped select gate 47 extends over the top and side surfaces of the stacked gate structure which is composed of floating gate 44 and control gate 45 through insulating films, such as films 51 and 50, and over a part of the substrate 41 through the first insulating thin film 42 (a gate oxidation film) on the substrate 41. The stacked gate structure is arranged perpendicular to the select gate 47. Line-shaped substrate diffusion regions (e.g., the source line 49 and the drain line 48) are alternately arranged parallel with the control gate, wherein one of the diffusion regions (the source 49) is offset from the control gate 45 (or the stacked gate structure) so that it is possible to achieve a matrix selection of respective semiconductor memory element regions using the control gate 45 and the select gate 47.\nFIG. 18 shows an equivalent circuit of the semiconductor memory device (memory cell array) shown in FIGS. 16 and 17. To select memory cell P1 for a storage operation, a voltage of 5 volts is applied to drain line D1 and source line S2, a voltage of 12 volts is applied to control gates CG1 and CG2, and a voltage of about 2 volts is applied to select gate SG1. Other lines are kept at a grounding potential. 
The reason for applying 5 volts to the source line S2 is to inhibit a storage operation in an"} +{"output_text": " of the impact is important for determining the location of the airbag deployment. In rear impacts, the location of the impact is important for determining the location of the airbag deployment. In side impacts, the location of the impact is important for determining the location of the airbag deployment. In rear impacts, the location of the impact is important for determining the location of the airbag deployment. In side impacts, the location of the impact is important for determining the location of the airbag deployment. In", "input_text": " applied upon stacking of batteries to manufacture a battery module, with the result that it is difficult to use various heat dissipation members, such as a water cooling type cooling plate or a non plate-shaped cooling member.\nTherefore, there is a high necessity for technology that is capable of fundamentally solving the above-mentioned problems. 1. Field of the Invention\nThis invention relates to sensors for measuring the location and the width of an impacted object.\n2. Background Art\nSensors and sensing systems for active restraints on vehicles generally include accelerometers, speed sensors, piezoelectric sensors, flex tape switches, ribbon switches and the like. Such prior art sensors and systems do not have the capability of determining the width of an object struck by the sensor on the vehicle. The systems also fail to provide a mechanism for determining the location of an impact on the sensor. Ribbon switches or tape switches such as those disclosed in U.S. Pat. Nos. 
3,694,600; 5,847,643; or 6,009,970 produce an output indicating that the strip switch or ribbon switch has been contacted but fail to provide any indication as to the width of the area contacted or the location of the area contacted on the elongated switches.\nInformation regarding the size and location of a collision would be useful for pedestrian protection systems or airbag deployment systems. If a pedestrian is struck by a vehicle, the greatest risk of injury is the risk of injury to the pedestrian. If, on the other hand, a pole, bridge abutment or another vehicle is contacted by a vehicle, the principal risk of injury is to the driver and passengers of the vehicle. In such instances, impact sensors are used to activate interior active restraint systems such as dashboard airbags or side curtains.\nThere is a need for sensors that can sense the location and size of an object struck by a vehicle. In side or front impacts, the location"} +{"output_text": " a system that will produce the maximum amount of ozone in the shortest amount of time.\nThe present invention is directed to a method and apparatus for generating ozone. The apparatus includes a housing having a first end and a second end. The housing defines a chamber having a first end and a second end. The first end of the chamber is in fluid communication with the first end of the housing. The second end of the chamber is in fluid communication with the second end of the housing. The apparatus also includes", "input_text": ". The air pump preferably includes a microbial filter to filter contaminants. A diffuser can be used to diffuse the generated ozone into the water reservoir.\nVarious factors impact the effectiveness of bacterial removal from the water such as the microbial load, pH, temperature, conductivity, and cooler characteristics (e.g., whether an ice ring has formed which can act as a shield for microbes trapped in the ice ring). 
Furthermore, the variability of power supply (e.g., European power supplies versus US power supplies) can cause a generator's application to be geographically limited unless modified. Additionally, time constraints for operation of the ozone generator and diffuser can impact operation.\nAdditionally, in certain refrigerated reservoirs an ice ring can form inside the reservoir adjacent to the cooling coils for the reservoir. Such an ice ring can serve as a form of protection for microbes contained in the ice ring when ozone is being diffused in the reservoir. After an ozone cycle, when the ice melts wholly or partially, the trapped microbes can enter the water and thus contaminate the reservoir.\nAdditionally, certain waters contain loadings of bromates which can cause problems.\nThe above indicate a need for developing a generator and diffuser containing flexibility regarding the timing, amount, and duration of ozone generated; along with the timing, amount, and duration of air supplied. Additionally, there is a need for killing microbes which may be trapped in ice rings. Furthermore, there is a need for addressing water containing bromates. Additionally, there is a need for addressing different types of electrical supplies for various geographical areas.\nIn a preferred embodiment the method and apparatus is directed to an economical means of overcoming each of the factors that limit process ozone's potential disinfecting capacity. It is concerned with the optimization of each point in small automated ozonation systems both upstream and downstream from the ozonator. The object of this effort is to devise"} +{"output_text": " GPS receiver, a processor, a display, a keypad, a battery, and a radio. The GPS antenna is typically a hand-held unit which is used to receive GPS signals from the GPS satellites. The GPS receiver is typically a separate unit which is used to process the GPS signals and to determine the location of the GPS unit. 
The processor is typically a separate unit which is used to process the GPS signals and to determine the location of the GPS unit. The display is typically a separate unit", "input_text": " positioning system (DGPS) network includes a receiver which receives ephemerides data from satellites. Typically, such data is received from global positioning system (GPS) satellites which are a part of the GPS satellite network or satellites which are a part of the Global Navigation Satellite System (GLONASS). The ephemerides data is processed via an electronics package located within the GPS unit. The GPS unit receives differential correction data through a separate radio which is typically connected to the GPS unit by cable. The differential correction data is typically obtained from a radio coupled to a GPS unit which is located at a fixed site which is placed at a known location or it is obtained from other sources and is conveyed via radio. By processing the differential correction data together with the data received at the particular GPS receiver, the location of the GPS unit may be determined within a high degree of accuracy. This same method may be used to perform real time kinematic (RTK) surveys so as to accurately determine the relative position of the GPS system with sub centimeter accuracy.\nPrior art GPS devices used in DGPS applications and RTK applications typically require numerous separate, distinct component units which are connected via cables. For example, the GPS receiver and processor would constitute one unit and the terrestrial radio would constitute a second unit which would be coupled to the GPS processor via cable. Typically, an input/output (I/O) unit which includes a display for data monitoring and a keypad for data input is also required. The I/O unit is coupled to the GPS receiver/processor unit and to the terrestrial radio via cable. Some systems also require the attachment of a separate battery via cable. 
Because multiple separate units are used in these prior art systems, the systems are bulky and they are difficult to move around.\nFor example, one type of prior art system which is typically referred to as \"handheld\" includes a GPS antenna, a"} +{"output_text": " to spatial diversity. Polarization diversity uses two or more antennas with different polarization states. The system performance is generally limited by the cross-correlation coefficient between the two antennas with different polarization states. The optimum performance occurs when the cross-correlation coefficient approaches zero.\nThe present invention provides a method of improving the performance of a wireless communication system. The method includes the steps of receiving a first signal from a first antenna, receiving a second signal from a second antenna, and combining the first signal and the second", "input_text": "opathol. 16(4):1005-12 (October 2001). Bcl-2 has also been shown to play a role in prostatic cancers, and has been specifically linked to aggressive tumors common in specific racial groups. Shi et al., Cancer Biother. Radiopharm., 16(5):421-9 (October 2001); Slothower, Study Suggests Bcl-2 Gene as a Cause for Aggressive Prostate Cancer in African American Men, U.C. Davis Med. Ctr., (May 1998). As briefly noted above, a study in which Bcl-2 production was down-regulated in prostate cancer cells showed inhibition of cell growth and increased sensitivity to treatments designed to induce apoptosis. Shi et al., Biother. Radiopharm., 16(5):421-9 (October 2001).\nAccordingly, it would be an advancement in the art to provide inhibitory oligonucleotide compounds configured to bind to, and consequently, to modulate the activity of nucleic acids encoding proteins which play a role in diseases such as cancer. It would be a further advancement to provide effective target sites for Bcl-2 gene regulation. 
It would be an advancement in the art to provide oligonucleotides complementary to effective antisense target regions of a nucleic acid encoding Bcl-2. It would be a further benefit in the art to provide compositions such as medications, including such oligonucleotides. Finally, it would be an improvement in the art to provide methods of using such oligonucleotides and compositions.\nSuch oligonucleotides, target regions, compositions including such oligonucleotides, and methods of their use are disclosed herein. Diversity techniques are widely used in wireless communications to improve the signal performance. Spatial diversity typically uses two or more antennas spatially separated. The system performance is generally limited by the cross-correlation coefficient between the two spatially diversified antennas. The optimum performance occurs only when the cross-correlation coefficient approaches zero.\nPolarization diversity provides an alternative"} +{"output_text": " the connection between F1 and F2 that was in the original T and was preserved by the process engineer when creating T\u2033 is marked as such, because the corresponding connection in the EPC has been added by the business user when going from S to S\u2032.\nIn the example above, the process engineer has to manually add the element t to T\u2033. This is a time-consuming and error-prone process. In addition, the process engineer may have to manually add the element t", "input_text": " with \u201cadded by transformation.\u201d This might seem non-intuitive, as the corresponding connection has not been changed from S to S\u2032. However, the corresponding connection in T\u2033 had been deleted when F2b was inserted by the process engineer. In an ideal scenario, the actual merge state of this connection might have to be \u201cdeleted in target.\u201d The merge, however, may not be able to distinguish between these two situations. 
To be able to correctly flag this connection as \u201cdeleted in target,\u201d it may be necessary to maintain the complete change history of T\u2032, which could be difficult or impossible or simply not worth the tradeoffs between size, space, complexity, possible processing gains, etc.\nFor elements that are found in T\u2033 and have no counterpart in T\u2032, two situations can be differentiated. If such an element t was created by the (original) transformation (that can be determined based on the presence of trace information on t), this may indicates that the source model S\u2032 has changed in a way such that an object corresponding to t is no longer created by the current transformation. Consequently, it may be added to the conflict model and flagged with the merge state \u201cdeleted by transformation,\u201d e.g., as indicated by a red minus icon. In this example, the connection between F1 and F2 that was in the original T and was preserved by the process engineer when creating T\u2033 is marked as such, because the connection no longer exists in T\u2032, since the corresponding connection in the EPC has been deleted by the business user when going from S to S\u2032.\nIf an element t was manually added to T\u2033, which can be recognized by the absence of trace information, it also may be added to the conflict model and flagged with the merge state \u201cadded in target,\u201d indicated by a blue plus icon. In this example, the function F2b and"} +{"output_text": " provide a fuel cell stack that is capable of being assembled with tight dimensional tolerances, and which is also capable of being disassembled for subsequent BOP connection and reassembly.\nIn addition to the aforementioned dimensional variances, the fuel cell stack is also subject to dimensional variances due to thermal expansion and contraction. For example, the fuel cell stack may be subjected to temperature variations of up to about 100\u00b0 C. during operation. 
Such temperature variations may cause the fuel cell stack to expand or contract by as", "input_text": " either case, the enclosed fuel cell stack is then mechanically and electrically secured to the vehicle or related device.\nIn the aforementioned design where the housing is defined by a series of side panels and end plates, it also typically includes interconnecting tie rods or bracketing elements to bind these discrete components, as well as to maintain a compressive force on the stacked fuel cells. The end plates are then compressed together by the brackets or tie rods that are mounted along the surface of one or more of the side panels. Compressive force is retained by securing the tie rods with bolts or related fasteners that extend normal to the generally planar surface of the side panels such that the bolts are loaded in shear.\nIntegration of fuel cell stacks into automotive platforms is a demanding challenge to the fuel cell system designer, necessitating precise placement and alignment with balance of plant (BOP) equipment that is situated inside of the vehicle's fuel cell system-receiving compartment. In the present context, BOP refers to components present in the vehicle, including but not limited to blowers, pumps, hoses, compressors or the like that are necessary for fuel cell stack integration, mounting and operability, but which are not part of the fuel cell system itself. Such integration demands tight dimensional tolerances of the assembled fuel cell stack.\nAssembled height variance along the stacking direction (which may be thought of as the Z-axis in a conventional Cartesean coordinate system) may be as much as 5 to 10 percent of the overall stack height; the present inventors have noticed variances of up to about \u00b18 millimeters (or 16 millimeters total) along the stacking direction. Such variances in stack height make it difficult to design a stack possessive of consistent, repeatable dimensions. 
This in turn hampers subsequent BOP connection and overall system placement within the corresponding vehicle compartment. As such, it would be advantageous to"} +{"output_text": " as slip increases. The critical slip value is dependent upon the road surface condition.\nWhen the brakes of a vehicle are applied, the wheel begins to rotate. As the wheel rotates, the wheel begins to slip against the road surface. The slip between the wheel and the road surface is dependent upon the road surface condition and the amount of slip between the wheel and the road surface. As the slip increases, the braking force increases. As the slip increases, the wheel begins to rotate faster. As the", "input_text": " FIG. 3, the non-shooting hand is not even close to the basketball, as the basketball is released from the shooting hand. Consequently, the ball in the Coddens' design would travel a much longer distance with only the shooting hand providing control. This condition would have two negative consequences. Firstly, defenders would find it very easy to deflect the ball from the shooter's hand. This often happens close to the basket where conditions are very crowded. Secondly, players with smaller hands would find it difficult to maintain control of the ball.\nFurthermore, it is noted that the Coddens' design includes a loop around the base of the index finger of the non-shooting hand. Even if this device would allow the non-shooting hand to extend close to the point of release, all four fingers, the index finger, the middle finger as well as the fourth and fifth finger can still interfere with the accuracy of the shot by allowing these fingers to drag on the ball. This is due to the fact that the fingers can still bend and form to the curvature of the ball. Finally, it is important to note that this loop is provided around the index finger and not the middle finger of the non-shooting hand. 
It is the middle finger that sends the strongest neurological message to the non-shooting hand and adjacent fingers to pull away and straighten, thereby eliminating any drag interference on the side of the ball that would cause shooting inaccuracy. This invention relates to an antilock control method for vehicle wheel brakes.\nWhen the brakes of a vehicle are applied, a braking force is generated between the wheel and the road surface that is dependent upon various parameters which include the road surface condition and the amount of slip between the wheel and the road surface. This braking force increases as slip increases until a critical value of slip is surpassed. Beyond the critical value of slip, the braking force decreases"} +{"output_text": ". The release film can be made of a material, which is not adhesive, but which is capable of being torn off. The release film can be made of a material, which is adhesive, but which is not capable of being torn off. The release film can be made of a material, which is capable of being torn off and which is adhesive. The release film can be made of a material, which is not adhesive, but which is capable of being torn off and which is not capable of", "input_text": " made of glass or a plastic material or has a fatty film, an efficient adhesive substance being required, whereas such a quality of the adhesive substance is objectionable, if it results in inconveniences regarding the removal of said tearing-off section from the primary area. Since it is practically impossible to, on the same surface, apply two adhesives having different properties, the problem must be solved in another satisfactory and relible way, which so far has not been achieved.\nThe object of the present invention is to counteract and as far as possible eliminate the above-mentioned invonveniences. 
Another object of the invention is to develop the state of the art in this field and to make possible a quicker, simpler and more reliable production and use of such labels as well as to develop new secure ways to handle the labels and achieve quality results.\nThese objects are attained according to the invention by designing a label of the type described in the introduction mainly in such a way, as is set forth in the characterizing clause of claim 1. Whereas the release layer according to the German specification is broken, partly in order to let through ink or pencil lead particles and partly to allow a reliable release from another label during a labelling procedure, the release layer according to the present invention is designed to guarantee a satisfactory fastening and release of portions of one and the same label as well as to make the label production more economical, more uniform and simpler. In the known case the label is provided with a release layer, print and writing respectively in two or three consecutive steps. According to the present invention these two or three steps are summed up in just one step, since the release layer at the same time belongs to the printing of the label for the rest. According to the invention a release film is a necessary component. 
The release surface of the release film and the release area of the label can have release properties, which differ from each other"} +{"output_text": " thereby display a gradation image.\nHowever, in the driving method described in JP-A-10-232649, since the light emitting time within the one-frame time is changed to thereby display a gradation image, the light emitting time within the one-frame time is changed for each of the eight sub-frames, so that the luminance of the light emitting time within the one-frame time is changed for each of the eight sub-frames, thereby causing a problem that the luminance", "input_text": " and accordingly improve the uniformity of display.\nHowever, in the system described in JP-A-2001-100655, since a source electrode, acting as a reference voltage for a transistor for driving an organic LED in a pixel is connected to an LED common electrode shared by the panel, some voltage drop is produced between the source electrode and common electrode. For this reason, even if the same signal voltage is applied, the gate-source voltage, which determines the operating point of the transistor, varies in response to variations in the source voltage, thereby encountering difficulties in removing the non-uniformity of display.\nAlso, the foregoing system has such a nature that variations in a threshold value, i.e., the on-resistance of a driving TFT for driving an EL cause a change in an EL driving current even if the same signal voltage is applied for controlling the current, so that TFTs which exhibit few variations and uniform characteristics are required for implementing the system. However, transistors for use in realizing such a driving circuit are obliged to be low temperature polysilicon TFTs which are manufactured using a laser anneal process and are high in mobility and applicable to a large-sized substrate. However, the low temperature polysilicon TFTs are known to suffer quite a few variations in element characteristics. 
Thus, due to the variations in the characteristics of TFTs used in an organic EL driving circuit, the luminance varies pixel by pixel, even if the same signal voltage is applied, so that the low temperature polysilicon TFT is not suitable for displaying a highly accurate gradation image.\nAs a driving method for solving the foregoing problems, JP-A-10-232649, for example, proposes a driving method for providing a gradation display which divides a one-frame time into eight sub-frames which are different in display time, and changes a light emitting time within the one-frame time to"} +{"output_text": " system to handle calls that are not being handled by the primary agents.\nThe manager also monitors the status of the agents assigned to a split. For example, the manager might monitor the number of calls handled by each agent, the number of calls that are not handled, the number of calls that are being handled by the primary agents, and the like. The manager also monitors the time spent by each agent on a call. For example, the manager might monitor the time spent by each agent on a", "input_text": " piston/cylinder seal. 1. Field of the Invention\nThe present invention relates to improvements in monitoring and managing agents and telephone call activity in telephone call centers. More particularly, the present invention relates to improved graphical user interfaces for monitoring and displaying the status of telephone call center agents.\n2. Background of Related Art\nTelephone call centers (or call centers) are networked groups of telephone operators or xe2x80x9cagentsxe2x80x9d that provide customer service for telephone callers. 
Call centers can be in many different forms, from large Operator Service Systems (OSSs) under the control of telephone companies to smaller private ones such as corporate customer service centers and telemarketing groups.\nAn important function of a call center is to provide efficient service to all customers, including timely and satisfactory handling of all received calls. Prior art automatic call distribution (ACD) systems are software hardware hybrids for helping to efficiently switch incoming telephone calls to suitable and available operators. Notwithstanding the use of an ACD system, a call center has one or more human managers monitoring all or a designated portion of the calls received and handled by it.\nCall center agents are often grouped according to xe2x80x9csplits.xe2x80x9d A split can be a type of service provided during a telephone call or a type of skill possessed by an agent. For example, one split might handle credit card orders, another might handle customer complaints, and yet another split might handle technical support. A split manager monitors the calls received by a split and either assigns calls or overrides the ACD system when thought necessary. In addition to assigning calls or overriding the ACD, the manager often adjusts the parameters of the ACD to influence the ACD behavior. For example, the manager could assign some back-up agents to work in a busy ACD"} +{"output_text": "rendered and stored on the content server. This allows for a more efficient use of network bandwidth, but requires that the content server be able to support a high bandwidth connection open to all the nodes. When new nodes are added, they are provided an address at which the server can connect to them. The balance of the provisioning is performed as a server side task. 
This centralization provides the administrator of the network with the knowledge of all nodes in a network as no node receives data without being centrally provision", "input_text": " This becomes a problem as the signage network grows, and a centralized server is responsible for transmitting unnecessary data to each display.\nConventionally, if a display is provisioned to retrieve only its location specific data, the display generates traffic on the network when it checks to see if new data is available on the content server. This polling of a centralized content server generates a small amount of traffic, but as the number of nodes in the network grows, the bandwidth consumed by this polling increases. Unless the time gap between polling events is increased as the network increases in size, the scalability of the system decreases.\nExisting advertising networks rely on central provisioning for a number of reasons, but one of the foremost reasons is that with the correct provisioning tools, the administrator of a subset of the overall network could errantly program the displays on another portion of the network. The provisioning tools are thus created in various versions so that the central authority can access all functions and devices, and so that administrators of subsets are provided certain access rights to the screens they have authority over. This allows for centralized control, but results in great difficulty if a small number of screens are needed to display a customized selection of data, or are needed to use a customized template specific only to those screens.\nCommunications between the displays and the centralized content sources in existing display networks tend to be direct connections. Each node directly obtains content from the centralized data source, requiring that the centralized data source be able to support a high bandwidth connection open to all the nodes. 
When new nodes are added, they are provided an address at which the server can connect to them. The balance of the provisioning is performed as a server side task. This centralization provides the administrator of the network with the knowledge of all nodes in a network as no node receives data without being centrally provisioned.\nAnimation effects and rendering of content is often pre-"} +{"output_text": " to be detected and treated.\nU.S. Pat. No. 6,971,541 discloses a device for monitoring respiration of a patient. The device includes a sensor for sensing a patient's respiration, a processor for processing the sensed respiration, and a display for displaying the processed respiration. The device also includes a memory for storing the sensed respiration and the processed respiration, and a transmitter for transmitting the sensed respiration and the processed respiration to a remote location. The device further includes a receiver", "input_text": ". 5,348,530 discloses a pneumatic ankle brace with a bladder and foot pump arrangement. The device of this patent is of rather complicated construction and requires use of a detachable hand-held pump.\nU.S. Pat. No. 4,841,956 discloses a device adapted to be mounted to the lower leg and foot of a person for inducing venous blood flow in the leg. This device includes a pulse generator and programmable distributor necessitating a non-ambulatory position for the wearer during use.\nU.S. Pat. No. 4,678,945 discloses a self-inflating ankle brace including air bags with resilient, compressible filler material. This patent discloses only a brace.\nU.S. Pat. No. 6,322,530, assigned to the instant assignee and incorporated herein by reference in its entirety, discloses a wrap made of a plurality of stretchable flexible straps. 
The straps wrap around the foot to hold in place one aircell positioned in the vicinity of the Achilles tendon and another aircell positioned in the vicinity of the arch of the foot, the two aircells being operatively connected to one another through a conduit member. As the user walks and steps on the aircell at the arch, that aircell is compressed, and the pressure in the aircell at the Achilles tendon is increased. As the user step off the arch aircell, the airflow is reversed, and air travels back from the Achilles tendon aircell to the arch aircell, ready for the next cycle. This device provides effective pneumatic compression of the Achilles tendon, but can be difficult for the user to apply and adjust properly. It is highly desirable to reliably track respiration within patients having pacemakers and ICDs. Tracking patient respiration permits potentially dangerous respiratory disorders, such as apnea,"} +{"output_text": " on Icebergs and Ice Shelters, held in New York City, N.Y., on Apr. 27-May 1, 1973. The structures illustrated in the paper are conical in shape and are constructed of steel. The structures are designed to be used in ice-infested waters and are intended to be placed in the path of ice floes. The structures are designed to be placed in the path of ice floes by being placed on the ice floes and being pushed upwardly by", "input_text": " in their paths.\nA still more severe problem encountered in arctic waters is the presence of larger masses of ice such as pressure ridges, rafted ice or floebergs. Pressure ridges are formed when two separate sheets of ice move toward each other and collide, the overthrusting and crushing of the two interacting ice sheets causing the formation of a pressure ridge. Pressure ridges can be very large, with lengths of hundreds of feet, widths of more than a hundred feet and a thickness of up to 50 feet. 
Consequently, pressure ridges can exert a proportionally greater force on an offshore structure than ordinary sheet ice; thus, the possibility of pressure ridges causing extensive damage to an offshore structure or the catastrophic failure of a structure is very great.\nA structure built strong enough to resist the crushing force exerted thereon by impinging ice, that is, strong enough to permit the ice to be crushed against the structure, enabling the ice to flow around it, would likely be very massive and correspondingly expensive to construct. Therefore, it has been proposed heretofore that structures which are to be used in ice-infested waters should be built with a sloping or ramp-like outer surface rather than with a surface which is vertically disposed to the impinging ice. As the ice comes into contact with the sloping outer surface, it is forced upwardly above its normal position which causes the ice to fail in flexure by placing a tensile stress in the ice. Since ice has a flexural strength of about 85 pounds per square inch, a correspondingly smaller force is imposed on the structure as the ice impinging thereon fails in flexure rather than compression.\nSeveral forms of conical offshore structures having sloping outer surfaces are illustrated in a paper by J. V. 
Danys entitled \"Effect of Cone-Shaped Structures on Impact Forces of Ice Floes\", presented to the First International Conference"} +{"output_text": " region of the polypeptide of the present invention, and the multimer is formed by the disulfide bond between the antibody recognition region and the hinge region of the antibody.\nThe multimer of the present invention can be produced by a method of expressing the multimer by using a vector having a gene encoding the polypeptide of the present invention and a gene encoding the antibody recognition region, or a method of expressing the multimer by using a vector having a gene encoding the polypeptide of the present invention and a gene encoding", "input_text": " group or the non administered group of mice. Further, in case of searching in vitro cultivation and growth of hemopoietic undifferentiated cells including hemopoietic stem cells, the bone marrow cells of mice are cultured in the groups with or without addition of the compound of the present invention, and the cultured cells are transferred into the lethal dose irradiated mice. Result of recovery is observed with the indications of survival rate and variation of blood counts. These results can be extrapolated to the humans, and accordingly useful effective data for evaluation of the pharmacological activities of the compound of the present invention can be obtained.\nApplications of the compound of the present invention for pharmaceuticals include diseases with abnormal differentiation of cells, for example leukemia and malignant tumors. These are cell therapy, which is performed by culturing human derived cells in vitro while maintaining their original functions or adding new functions, and a therapy, which is performed by regenerating without damaging the functions orginally existing of the originally existed in the tissues by administering the compound of the present invention under the regeneration after tissue injury. 
Amount of administration may differ in the type of preparation and ranges from 10 xcexcg/kg to 10 mg/kg.\nFurther strong physiological activity can be achieved by expression of forming multimer of the polypeptide of the present invention.\nAs shown in Example 10, since the suppressive action of human Delta-1 and human Serrate-1 is stronger in the IgG chimera protein having dimer structure, a form of stronger physiological activity is preferably expressed in the form of multimer formation.\nHuman Delta-1 and human Serrate-1 having multimer structure can be produced by a method of expressing chimera protein with human IgG Fc region as described in the example and expressing the multimer having disulfide bond with hinge region of the antibody, or a method expressing chimera protein, in which antibody recognition region is expressed in the C-terminal"} +{"output_text": " voltage V05 (e.g., 500 mV) on the bitline BLB. The voltage V05 is compared against the reference current IREF to provide the digital output OUT.\nFIG. 2K shows an example of a known PCM single ended sense amplifier 2600. Generally, in a single ended sense amplifier, a cell read output conducted by a selected bitline BLB is compared against a reference current to provide a digital output OUT. When the PRECHARGE signal turns on transistor", "input_text": " and reset resistances, also decreases over time. Larger sense margins generally result in more reliable reads, and a sense margin which is too small may not permit reliable reading at all. 2G represents the approximate behavior of one known PCM material; other PCM material compositions may behave differently. 
For example, other PCM material compositions may display variation of the set resistance over time.\nThe downwards drift of reset resistance may be due to, for example, shrinking size of the amorphous zone of the phase-change material, due to crystal growth; and, in some cells, spontaneous nucleation steepening the drift curve (possibly only slightly) due to introducing further conductive elements into the mushroom-shaped programmable region.\nFIG. 2H shows an example of a processing system 2300. Typically, a processing system 2300 will incorporate at least some of interconnected power supplies 2310, processor units 2320 performing processing functions, memory units 2330 supplying stored data and instructions, and I/O units 2340 controlling communications internally and with external devices 2350.\nFIG. 2I shows an example of a PCM single ended sensing memory. Two different PCM cells 2400 on different ends of a sense amplifier can be selected separately. Selected elements 2410 are separately sensed by a single-ended sense amplifier 2420.\nFIG. 2J shows an example of a known PCM single ended sense amplifier 2500. Generally, in a single ended sense amplifier, a cell read output conducted by a selected bitline BLB is compared against a reference current to provide a digital output OUT. When the PRECHARGE signal turns on transistor 2530, voltage V04 (e.g., 400 mV) precharges the bitline BLB. After precharge ends, the READ signal turns on transistor 2550. Transistor 2550 is connected, through source follower 2560 and load 2580, to provide a"} +{"output_text": " (PMMA) matrix and a cyanine dye. The PMMA matrix is a rigid polymer that is not soluble in common organic solvents. The cyanine dye is a rigid organic molecule that is not soluble in common organic solvents. The PMMA matrix and the cyanine dye are not soluble in common organic solvents. The PMMA matrix and the cyanine dye are not soluble in common organic solvents. 
The PMMA matrix and the cyanine dye are not soluble in common organic solvents. The PMMA", "input_text": "PS-type techniques is consequently not perfect as well. In addition, the DGPS receivers are more complex, and therefore more expensive, than ordinary GPS receivers.\nIn the transportation industry, it is important to know which path a vehicle has taken from among a plurality of possible fixed paths. In particular, in the railroad industry, it is important to know whether a train is on the correct track after passing a switch. If the switch is set at an incorrect position and the train has taken the wrong track, a collision may result. Ideally, track switches are set at the correct position so that a train will take the correct track and, in the event the switch is not correctly set, a train operator will stop the train before or shortly after passing the switch. However, human beings are imperfect and prone to mistakes. Thus, it would be desirable to have a system that can automatically determine whether a correct path has been taken. However, in many situations, alternate paths are often separated by a distance less than the accuracy of a GPS system receiver and are therefore not spaced far enough apart to permit an unambiguous determination as to which of two or more alternate paths have been taken by a vehicle.\nTherefore, what is needed is a system, method and apparatus for determining whether a vehicle has taken a correct path when alternate paths are separated by a distance less than the accuracy of a positioning system receiver. The present invention relates to holographic recording materials (HRMs) having a polymer matrix and a light harvesting dye.\nThe fundamental aspect of an HRM is to utilize a photochemical phenomenon wherein the light harvesting dye absorbs light, reacts with the polymer matrix, and alters the index of refraction. 
These induced refractive index modulations result in phase holograms with high diffraction efficiency and angular selectivity. Previous HRMs are well known, but the HRM closest to the subject invention is limited to a poly(methyl methacrylate)"} +{"output_text": " 22. The touchpad 10 measures the amount of charge that must be injected onto the sense line 16 to reestablish or regain balance of charge on the sense line. The touchpad 10 determines the amount of charge that must be injected onto the sense line 16 by measuring the amount of charge that must be injected onto the sense line 16 to reestablish or regain balance of charge on the sense line 16. The touchpad 10 determines the amount of charge that must be injected onto the sense line 16 by measuring", "input_text": " of X (12) and Y (14) electrodes and a sense electrode 16 is used to define the touch-sensitive area 18 of the touchpad. Typically, the touchpad 10 is a rectangular grid of approximately 16 by 12 electrodes, or 8 by 6 electrodes when there are space constraints. Interlaced with these X (12) and Y (14) (or row and column) electrodes is a single sense electrode 16. All position measurements are made through the sense electrode 16.\nThe CIRQUE\u00ae Corporation touchpad 10 measures an imbalance in electrical charge on the sense line 16. When no pointing object is on or in proximity to the touchpad 10, the touchpad circuitry 20 is in a balanced state, and there is no charge imbalance on the sense line 16. When a pointing object creates imbalance because of capacitive coupling when the object approaches or touches a touch surface (the sensing area 18 of the touchpad 10), a change in capacitance occurs on the electrodes 12, 14. What is measured is the change in capacitance, but not the absolute capacitance value on the electrodes 12, 14. 
The touchpad 10 determines the change in capacitance by measuring the amount of charge that must be injected onto the sense line 16 to reestablish or regain balance of charge on the sense line.\nThe system above is utilized to determine the position of a finger on or in proximity to a touchpad 10 as follows. This example describes row electrodes 12, and is repeated in the same manner for the column electrodes 14. The values obtained from the row and column electrode measurements determine an intersection which is the centroid of the pointing object on or in proximity to the touchpad 10.\nIn the first step, a first set of row electrodes 12 are driven with a first signal from P, N generator 22, and a different but adjacent second set of row electrodes are driven with a second signal from the P, N generator"} +{"output_text": " predetermined intervals, and the other of which receives the signal. The transmitter/receiver devices are connected to a control device, which controls the operation of the transmitter/receiver devices.\nIn the event of a failure in the signal transmission path, the control device is unable to detect the existence of a train in the block section. In such a case, the control device is unable to control the operation of the transmitter/receiver devices, and the transmitter/receiver devices are unable to detect", "input_text": "idium silver pentaiodide fast ion conductor, while other silver ions from the fast ion conductor are injected into a tungsten oxide electrochromic material causing it to turn blue. It is disclosed that the second electrode can be very small, such as a Dag contact placed on the surface of the fast ion conductor layer (as opposed to a continuous film adhered to the fast ion conductor), and serves merely as a source of the fast metal ions. 
The second electrode does not participate in the modulation of the transmitted or reflected electromagnetic radiation.\nIt would be desirable to prepare an electromagnetic radiation modulating device, wherein modulation of the transmissivity and reflectivity of electromagnetic radiation could be precisely controlled over a wide range. Such a device would be particularly useful were it able to substantially reduce the transmission of infrared radiation as well as visible light rays. Thus, the device could be used to prevent the passage of heat energy therethrough, and would therefore be especially suited for use as an automotive or architectural glazing. Furthermore, the usefulness of such a device would be particularly enhanced were it able to maintain an established transmissivity or reflectivity after the removal of an electrical potential. The present invention relates to a method of detecting a train in a block section using a track circuit, and particularly to a train detecting method which is capable of maintaining safety even in the event of a failure in a signal transmission path of the track circuit.\nA conventional railway system employs a method which uses a track as part of a signal transmission path to detect the existence of a train in a block section. In such a method, the track is electrically divided into plural sections, each having a predetermined length. Such a section forms a part of an electric circuit, which is commonly referred to as a track circuit. At respective ends of each track circuit, there are arranged transmitter/receiver devices, one of which transmits a signal for detecting a train continuously or at"} +{"output_text": "rapeutics, Apr. 27-May 1, 2004, Nice, France; R. Humphreys, et al., Ann. Oncol. 2004, 15: iii102, abstr. 383PD).\nThe TRAIL-R3 (DR5) is a type II transmembrane protein that is expressed on the cell surface. TRAIL-R3 is a member of the TNF receptor family and is a receptor for TRAIL. 
TRAIL-R3 is expressed in a wide variety of", "input_text": ", TNF\u03b2, FasL, and other TNF superfamily ligands demonstrated the ability to both initiate apoptosis and kill transformed cells, virally infected cells, and chronically activated T cells and B cells (S. R. Wiley et al., Immunity 1995, 3: 673-682; P. T. Daniel and P. H. Krammer, J. Immunol. 1994, 152: 5624-5632).\nHowever, systemic therapeutic administration of recombinant TNF\u03b1 (and similarly TRAIL in animal studies) in cancer patients results in a massive inflammatory response, as well as direct hepatotoxicity (A. L. Jones and P. Selby, Cancer. Surv. 1989, 8: 817-836).\nOne version of recombinant TRAIL, Apo2L/TRAIL (PRO1762), is currently being studied in phase I clinical trials. (A. Almasan and A. Ashkenazi, Cytokine Growth Factor Rev. 2003, 14: 337-348).\nHGS-ETR1 (mapatumumab; Human Genome Sciences), a fully human agonistic monoclonal antibody that targets TRAIL-R1, is in phase II evaluations in patients with advanced malignancies. Fully human monoclonal antibodies to TRAIL-R2 (HGS-ETR2; Human Genome Sciences), such as HGS-TR2J, have also entered the clinic and are currently in phase I clinical development. HGS-ETR2 and HGS-TR2J have slightly different physiochemical and kinetic profiles that warrant the exploration of both in the clinic (R. Humphreys, et al., Ann. Oncol. 2004, 15: iii102, abstr. 383PD; R. Humphrey, et al., Presented at the 16th EORTC-NCI-AACR Symposium on Molecular Targets and Cancer The"} +{"output_text": " a surface to which the capture entity can be attached.\nThe capture entity can be attached to the solid phase in a number of ways. The most common is to use a chemical linker to attach the capture entity to the solid phase. This can be done by reacting a functional group on the capture entity with a reactive group on the solid phase. This can be done by reacting a functional group on the capture entity with a reactive group on the solid phase. 
This can be done by reacting a functional group", "input_text": " as part of their natural functioning, such as in antibody/antigen interactions in disease resistance, or receptor/ligand interactions for cell signalling. This can be harnessed in a separation technique to obtain 100% separation in one step and is therefore a particularly powerful method. With such a powerful separation method it is important when performing these activities that all contaminating unbound materials are removed with high efficiency. Preferably there should be 100% efficiency both of contaminant removal and of capture of desired substance in the first pass. This requires maximal interaction between the desired substance or entity with its binding partner as well as very efficient washing.\nA method is therefore needed where sufficient capture molecules to capture all the target molecules are held on a solid phase. This is done in such a way that the target molecule or entity has easy access to the binding site but also so that non-binding moieties can be washed off vigorously without trapping or spuriously interacting.\nThere are many formats for performing affinity chromatography procedures. All of them share the general feature of having one side of the binding pair immobilised to a solid phase. This most commonly consists of a bead material sometimes packed into columns. The liquid containing the desired entity can then flow over, around and in some cases inside the beads coming into contact with the capture entity and contaminants can then be washed by allowing washing solutions to flow over the bead surfaces. 
The stringency of the washing procedure can be influenced by the nature of the washing solution such as by temperature, ionic strength, pH, solvent mix etc.\nBeads are in many cases a preferred option as this format maximises the surface on which the capture entity is immobilised.\nIn general the major requirement is for an insoluble material to which the capture entity can be attached such that a fluid containing the target be passed over the solid phase to allow maximum contact between the compartments. Also the solid phase normally requires"} +{"output_text": " of the sample population are representative of the larger population.\nThe sample population is typically selected by a random process. For example, a company may select a sample of households in a particular city. The company may then contact each household in the sample to determine the viewing habits of the household. The company may then use the viewing habits of the sample population to extrapolate the viewing habits of the larger population.\nThe sample population is typically selected by a random process. For example, a company may select", "input_text": " detector is used to detect the modulation. The CCK codewords are an eight chip Walsh code that can be decoded with a fast Walsh transform. The correlators typically implement the transform as a butterfly function comprising 64 separate correlations requiring 512 complex additions to decode the 64 subcodes which are used to estimate six bits of reconstructed data. The remaining two bits of the signal are demodulated using the DQPSK demodulation. For 5 Mbits/s operation, 28 butterflies and 112 complex additions are required to decode two bits.\nIn a pending U.S. patent application, the inventor and others have disclosed a method of extending the data rate of a DSSS WLAN through the use of bandwidth efficient M-ary phase shift keying modulation. 
While the signals of the extended data rate system are structurally similar to those of the higher data rate IEEE 802.11b CCK operating modes, the information bits encode 4096 codewords for transmission. Correlation of the signal utilizing the process of the IEEE 802.11b CCK modes would require a substantial bank of correlators significantly increasing the cost and complexity of the transceiver. What is desired, therefore, is a method of correlating an M-ary PSK waveform that reduces the number of correlators required in a reciever. 1. Field of Invention\nThe invention relates generally to the field of computer-assisted data manipulation and analysis. Specifically, in one exemplary aspect, the invention relates to methods and apparatus for collection and classification of data regarding an audience in a content-based network such as a cable television or satellite network.\n2. Description of Related Technology\n\u201cNielsen Ratings\u201d are a well known system of evaluating the viewing habits of cross sections of the population. When collecting Nielsen ratings, companies use statistical techniques to develop a sample population which is a cross section of a larger national population. Theoretically, the viewing habits"} +{"output_text": " provided with an adhesive coating on the inner surface, whereby, when folded together along the fold line between the fourth and fifth panels, the inner surfaces of the two adjacent panels are secured together. The fifth panel is connected along a fold line to the upper edge of the second side panel along a fold line. The fifth panel is also connected along a fold line to the upper edge of the extended panel along a fold line. The fifth panel is also connected along a fold line to the upper edge of the", "input_text": " a front or rear wall panel. As in the previous aspect, additional printing surface area is provided by at least one extended exterior panel that is connected to an edge of one of the carton side walls. 
The various features and refinements mentioned above in the first aspect for the extended panel may also be used with this second aspect. At least one of the surfaces of this extended panel is coated for printing.\nAnother aspect of the present invention is to provide a carton blank for folding into a carton and having a plurality of adjacent panels which form a front wall, rear wall, two side walls, an extended exterior panel, and end closure panels. The blank is comprised of multiple rectangular panels, all of which have substantially the same width but varying lengths. A coating is conventionally provided on one surface of the blank for printing, images, or other indicia. The box is so folded that the coated side forms the exterior of the box and the uncoated side forms the interior of the box. One of the rectangular panels forms a front wall having front and rear edges defined by the long sides of the panel. Connected to the front wall along a fold line is the rectangular panel forming a side wall and having an upper and lower edge. A third rectangular panel, connected along a fold line to the lower edge of the side panel, forms the rear wall. A fourth rectangular panel connected along a fold line to the front edge of the rear wall, forms a second side wall. A fifth and sixth panel of the same size and connected along fold lines, is connected to the upper edge of the second side panel along a fold line. One of the two panels is provided with an adhesive coating on the inner surface, whereby, when folded together along the fold line between the fourth and fifth panels, the inner surfaces of the two adjacent panels are secured together. Alternately, the fifth, or extended, panel can be"} +{"output_text": " receive data over the air, minimizing the need for wired connections. 
Thus, wireless LANs combine data connectivity with user mobility.\nIn a wireless LAN, each user terminal (referred to as a \u201cclient\u201d) communicates with one or more access points (referred to as \u201cbase stations\u201d) which are connected to a network. Typically, the client communicates with the base station using an analog radio frequency (RF) signal. However, to support the growing demand for mobile communications, the wireless", "input_text": " embodiment further comprises a mount removably fastened to the front portion of the tray. Further, a first leg is connected to the mount wherein the first leg extends down from the front portion. The first leg additionally has a first curvilinear portion positioned to curve outwardly from the front portion.\nA second leg is connected to the mount wherein the second leg also extends down from the front portion. The second leg further has a second curvilinear portion positioned to curve outwardly from the front portion.\nAdditionally a first leg support connects to the first leg opposite the mount and a second base support connects to the second leg opposite the mount in order to connect the first leg and the second leg to the wheelbarrow.\nAn advantage of the present invention is to provide a rollbar support unit that efficiently and conveniently pivots the tray in a smooth manner.\nAnother advantage of the present invention is to provide a rollbar support unit that can adequately support a heavy load.\nAnother advantage of the present invention is to provide a rollbar support unit that eliminates multiple assembly elements.\nAnother advantage of the present invention is to provide a rollbar support unit that can be conveniently and efficiently be connected to the wheelbarrow.\nStill further advantages will become apparent from a consideration of the following descriptions and drawings. 1. 
Field of the Invention\nThe invention generally relates to pipeline ADC (Analog to Digital Converter) units, and in particular to WLAN (Wireless Local Area Network) communication devices such as transmitters, receivers and transceivers, and corresponding integrated circuit chips and methods, where pipeline ADC units are used for converting analog transmission and/or reception signals to digital data.\n2. Description of the Related Art\nA wireless local area network is a flexible data communications system implemented as an extension to or as an alternative for, a wired LAN. Using radio frequency or infrared technology, wireless LANs transmit and"} +{"output_text": " made of a magnet.\nAccording to a further development of the invention, the operating unit is constructed as a rotary toggle. The rotary toggle is preferably constructed as a rotary slide.\nAccording to a further development of the invention, the operating unit is constructed as a rotary slide. The rotary slide is preferably constructed as a rotary slide.\nAccording to a further development of the invention, the operating unit is constructed as a rotary slide. The rotary slide is preferably constructed as a rotary slide.\nAccording", "input_text": " at least one functional unit of the electrical appliance, has an operating unit with a rotary toggle. A rotary toggle has the advantage that radial position signals can be very readily differentiated. On generating a position signal as a function of a specific rotation, in the case of several rotations of the rotary toggle a randomly large number of position signals can be generated.\nThe rotary toggle is mounted so as to rotate on a bearing device forming part of the operating device. There are also signalling means with which, as a function of a position and/or a position change of the rotary toggle, signals can be generated. For processing said signals the operating device has a control mechanism. 
As a function thereof the electrical appliance or a functional unit thereof is influenced or controlled.\nAccording to the invention, in addition to a rotary movement, the rotary toggle is movably mounted in at least one direction in such a way that it can perform a linear movement. For this purpose further signalling means are provided for detecting the linear movement of the rotary toggle and which can be converted into signals for the control mechanism.\nThus, a combined rotary-sliding toggle for operating an electrical appliance is created. It is advantageously possible by a linear or sliding movement in a specific direction to make a specific choice for a power scale or the like. This power scale can then be adjusted by a subsequent rotary movement. It is also possible by means of a rotary movement to pass or leaf through a menu. With a corresponding linear movement it is then possible to jump into a submenu or to perform a selection or confirmation.\nAccording to a further development of the invention, the bearing device can be constructed in such a way that the operating unit is removably fixed to a cover of the electrical appliance. For this purpose a holding power is provided, which is preferably non-contacting or has an external power effect. Advantageously use is e.g."} +{"output_text": "), the current macroblock is compared with the regions A1, A2, A3,... respectively within a specified searching range centered around the region A0 within the reference frame corresponding to the current macroblock, wherein the regions A2, A3,... have shifted from the preceding regions A1, A2,..., by one pixel starting from the upper left comer A1. 
This method requires the small-scale circuit for the processing and the small-scale capacity for the memorizing due to the", "input_text": " other words, as methods for searching for the region (=reference macroblock) bearing the closest resemblance to the current macroblock within the above mentioned searching region, the following are available.\n*Full Searching (FS) Method\nAs illustrated in FIG. 5(a), the current macroblock is compared with the regions A1, A2, A3,... respectively within a specified searching range centered around the region A0 within the reference frame corresponding to the current macroblock, wherein the regions A2, A3,... have shifted from the preceding regions A1, A2,..., by one pixel starting from the upper left comer A1. This method requires the large-scale circuit for the processing and the large-scale capacity for the memorizing due to enormous number of regions to be compared but the precision of the motion vector detection is high.\n*Logarithmic Searching (LS) Method\nAs illustrated in FIG. 5(b), comparison is made, for example, from the upper left.fwdarw.upper.fwdarw.upper right.fwdarw.left.fwdarw.center.fwdarw.right.fwdarw.under left.fwdarw.under.fwdarw.under right in this order within a specified searching range centered around the region A0 within the reference frame corresponding to the current macroblock. Incidentally, the order is an example, and the comparison may be done by another order. Then, the searching range is narrowed toward the region A3 which bears the closest resemblance to the current macroblock in the above 9 regions, and comparison is repeated in this way. As the narrowing of the searching range is logarithmic, this method is called \"logarithmic searching method. \"\n*Telescopic Searching (TS) Method\nAs illustrated in FIG. 5(c"} +{"output_text": " of the groove being parallel to the plane of the lens. The lens of the present invention utilizes a Fresnel surface which is sealed from its immediate environment. 
The Fresnel surface is molded into a lens body which is then sealed to the lens body by a thin layer of silicone. The lens body is then placed in a mold and a silicone gel is injected into the mold to form a lens body which is then sealed to the lens body by a thin layer of silicone. The lens body is then placed", "input_text": " spherical are generally stiffer, as one might expect, and therefore resist bending or folding and generally require a larger incision for insertion. The combination of Fresnel surfaces on the front and back side has proven self-defeating in other applications. Therefore, it may be assumed that the same would happen in the case of a two sided Fresnel optic, i.e. a Moire Pattern is formed. In other words, interference of the light is caused by the superimposition of two regularly spaced patterns, causing light and dark rings.\nThe open Fresnel lens has further shortcomings which relate primarily to the medical aspects of such devices. Since the Fresnel surface may be placed in the sulcus of the eye or the capsular bag, the Fresnel surface is adjacent to the iris. The diameter of the iris, which is a variable opening in the eye, changes rapidaly as the ambient illumination changes. Under bright daylight conditions, the iris opening may be only a few millimeters, changing to wide open under darkened conditions. The action is entirely involuntary. The Fresnel surface in order to be effective must have sharp ridges at the juncture of individual lenticules. These, of course, could abrade the rear surface of the iris, causing inflammation. Also particulate pigment sheared from the iris or inflammatory cells may lodge in the grooves of the Fresnel, destroying its effectiveness.\nThe lens of the present invention overcomes the foregoing objections to a Fresnel lens plus adds some unique features not present in existing intraocular lenses of any type.\nThe lens of the present invention utilizes a Fresnel surface sealed from its immediate environment. 
Fresnel lenses are flat optical devices which focus light from a series of concentric grooves or lenticules which are molded or cut into a surface of the device. Each groove is trapezoidal in crosssection, with the face"} +{"output_text": " cobalt metal is disclosed in U.S. Pat. No. 4,961,917. The process comprises the steps of:\n(a) treating the concentrate or matte with an aqueous solution of a complexing agent to form a nickel-cobalt-copper sulphide solution;\n(b) leaching the nickel-cobalt-copper sulphide solution with an aqueous solution of an oxidizing agent to form a leach liquor;\n(c)", "input_text": " a female shaft 11. The irregular portion 16 has a form in which one tooth is removed to connect or merge two bottoms on both sides of the removed tooth into a widened bottom as an irregular portion 16. Two diametrically opposed irregular portions 16 may be provided as in another example shown in FIG. 4a and 4b. These spline teeth of particular patterns are finished by broaching. As shown in FIGS. 3b and 4b, a filling 17 as an irregular portion is provided in the bottom between two spline teeth of an irregular pattern of a male shaft 14 corresponding to the irregular portion 16 of the female shaft 11. The filling 17 is formed by spot welding so as to have a height not higher than that of the other spline teeth of the male shaft 14.\nWith the method shown in FIGS. 2a and 2b, it is difficult to work only one spline tooth of the male shaft 14 by tooth cutting to form the irregular portion. Moreover, with the method shown in FIGS. 3b and 4b, the filling 17 tends to fall off in use. (i) Field of the Invention\nThe invention relates to an improved hydrometallurgical process for the recovery of cobalt and nickel from nickel cobalt sulphides. 
More specifically, the invention relates to the separation of cobalt and nickel from an ammoniacal leach liquor to produce a substantially nickel-free cobaltic hexammine sulphate-containing solution wherein the formation of cobalt (III) hexammine sulphate ([Co(NH3)6]2(SO4)3) has been optimized from which overall enhanced recovery and increased production rate of high purity cobalt metal may be obtained.\n(ii) Description of the Related Art\nA hydrometallurgical process for the treatment of nickel-cobalt-copper sulphide concentrates and mattes to produce high grade nickel and"} +{"output_text": " the cap substrate 6. A piezoelectric element 7 is mounted on the movable portion 1 so as to be opposed to the cavity.\nIn the acceleration sensor having the above-described structure, when an acceleration is applied to the movable portion 1, the movable portion 1 is displaced in the direction of the arrow A in FIG. 5. The displacement of the movable portion 1 is transmitted to the piezoelectric element 7, and the piezoelectric element 7 is displaced in the direction of the arrow B in FIG. 5.", "input_text": " plane of its mass, characterized by the fact that it presents, through the adaptation between the TEM wave or mode and said peripheral wave or mode, a distribution of the effective permeability of the ferrite, spatially non-uniform in correspondence with the tapering zones or arms which are adapted for connection with the transmission lines. Henceforth, such spatial non-uniform distribution of the magnetic permeability will be indicated by the expression \"magnetic tapering.\" Said magnetic tapering is obtained, according to the present invention, through the use of magnetic fields, of spatially non-uniform polarization. Said magnetic fields of polarization in their turn are obtained by the use of permanent magnets and possibly with the addition of a ferro-magnetic element inserted into the magnetic circuit. 1. 
Field of the Invention\nThe present invention relates to an external force detecting sensor formed by using a semiconductor micro-processing technique or the like.\n2. Description of the Related Art\nGenerally, acceleration sensors and angular velocity sensors are known as external force detecting sensors. Each of these external force detecting sensors is provided with a movable portion which is displaced in accordance with an external force, such as acceleration, angular velocity, or the like applied to the external force detecting sensor. The displacement is electrically detected to obtain an acceleration signal or angular velocity signal. For example, as shown in FIG. 5, an acceleration sensor using a piezoelectric element described in Japanese Unexamined Patent Application Publication No. 10-104263 has a movable portion 1, which includes a weight portion 4 supported on a supporter 2 by beams 3 in the central portion thereof. A supporting substrate 5 and a cap substrate 6 having recesses 5a and 6a, respectively, are mounted to the supporter 2 so as to sandwich the supporter 2 from the top and bottom. 
In addition, a cavity is formed at the central portion thereof using the recesses 5a and 6a of the supporting substrate 5 and"} +{"output_text": " for H; F; Cl; Br; I; methyl; \u2014OH; \u2014NH2 or \u2014OR16; R4 stands for H; F; Cl; Br; I; methyl; \u2014OH; \u2014NH2 or \u2014OR16; R5 stands for H; F; Cl; Br; I; methyl; \u2014OH; \u2014NH2 or \u2014OR16; R6 stands for H; F; Cl; Br; I; methyl; \u2014OH;", "input_text": "C(CH3)2(CH2OH); R13, R16, R17, R22, R23 and R27 each independently stand for a radical selected from the group consisting of methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl, n-pentyl, and ethenyl; R40 and R41 form, together with the interconnecting nitrogen atom as ring member, a radical selected from the group consisting of pyrrolidinyl, piperidinyl, piperazinyl, morpholinyl, and azepanyl, of which the heterocycloaliphatic moiety can in each case be unsubstituted or substituted by 1, 2, 3, 4, or 5 radicals R57;and R57 stands for an alkyl radical selected from the group consisting of methyl, ethyl, n-propyl, isopropyl, tert-butyl, n-butyl, sec-butyl, and isobutyl;in each case optionally in the form of one of the pure stereoisomers thereof, particularly enantiomers or diastereoisomers, or the racemates thereof or in the form of a mixture of stereoisomers, particularly the enantiomers and/or diastereoisomers, in an arbitrary mixing ratio, or in each case in the form of corresponding salts, or in each case in the form of corresponding solvates.\nVery special preference is given to compounds of the general formula Ic,\nin which D stands for N or CH; R1 stands for H; F; Cl; Br; or I; R1 stands for H; F; Cl; Br or I; R2 stands for H; F; Cl; Br; I; methyl; \u2014OH; \u2014NH2 or \u2014OR16; R3 stands"} +{"output_text": " are corrosive to the metal surfaces. For example, engine mounts are often exposed to fluids such as engine oil, transmission fluid, and coolant. 
These fluids are often corrosive to the metal surfaces of the engine mounts.\nThe elastomer-to-metal adhesive bonds are typically formed by applying a primer to the metal surface and then applying a covercoat adhesive to the primer. The primer is typically a two-part composition that includes a primer base and a catalyst. The primer base is typically", "input_text": " to formation of only four isomers of fosinopril. However, enantiomers, which are mirror images of each other are separated from the mixture, c) separation of the pure desired isomer from the racemic mixture calls for optical resolution, d) optical resolution leads to considerable wastage of the desired isomer leading to low efficiency and increase in cost of manufacture, e) no method is described for recycling of the unwanted isomers back to the desired one, and f) formation of polymorphic Form-A of fosinopril sodium is dependent on the solvent(s) employed and the amount of water present in the solvent(s).\nA need, therefore, exists for a method for the synthesis of fosinopril sodium, which in addition to eliminating/minimising the disadvantages, specially optical resolution associated with the prior art methods, provides a cost-effective and convenient method for synthesis of the objective compound. The present invention relates to an aqueous primer or coating, particularly a primer for use in polymeric material-to-metal adhesive bonding and a coating for protecting metallic surfaces.\nPrimers are often used as an undercoat in combination with a covercoat adhesive in order to achieve superior bonding between two substrates made from different materials. One particular application for such primers is in bonding metal surfaces to elastomeric surfaces. Elastomer-to-metal bonding is subjected to severe environmental conditions in many industrial and automotive assemblies. 
For example, many engine mounting assemblies that employ elastomer-to-metal bonding contain fluids in order to assist in damping of vibration of the engine. These fluid-filled engine mounting devices are being exposed to increasingly high temperatures such that the elastomer-to-metal adhesive bonds within the mounts are being exposed to very high temperature fluid environments. Many elastomer-to-metal assemblies, particularly those utilized in automobile applications, are routinely exposed to materials that"} +{"output_text": "VXPSn is a power supply voltage of the memory block) by the NMOS transistor MN11. The SSL node is charged to a high voltage VXPSnxe2x88x92Vtn by the NMOS transistor MN10. The SSL node is connected to the BLKWL node through the NMOS transistor MN5. The SSL node is connected to the BLKWL node through the NMOS transistor MN5 and the NMOS transistor MN10. The SSL node is", "input_text": " transistor MN11 is connected between the BLKWL node and a ground voltage and is turned on/off by an output signal of the NAND gate G6. An NMOS transistor MN10 is connected between an SSLGND node and a string selection line SSL and is turned on/off by the output signal of the NAND gate G6.\nThe switch block 46xe2x80x2 shown in FIG. 3 has the same structure as shown in FIG. 2 and will not be explained in further detail. The decoding block 42xe2x80x2, the precharge block 44xe2x80x2, and the NMOS transistors MN5, MN10, and MN11 constitute a block decoder circuit. The block decoder circuit and the switch block 46xe2x80x2 will repeatedly be present in each memory block so that each memory block may have the same circuit pattern. The SSLGND node has a ground voltage in read and program operations and has a power supply voltage in an erase operation.\nWhen the address signals DA1-DAi applied to the NAND gate G4 are high and the control signal is low, the output signal of the NAND gate G6 is made low. This allows the NMOS transistor MN10 and MN11 to be turned off. 
Such an operation is performed in a selected memory block. When one of the address signals DA1-DAi applied to the NAND gate G4 is low and the control signal BLKWLdis is high, the output signal of the NAND gate G6 is made high. This allows the NMOS transistors MN10 and MN11 to be turned on. Such an operation is performed in an unselected memory block.\nIn case of the selected memory block, the BLKWL node is charged to a high voltage VXPSnxe2x88x92Vtn ("} +{"output_text": ". The control valves are coupled to the respective hydraulic cylinders by a common hydraulic line. The control valves are operated by a common hydraulic pump. The pump is driven by the engine of the towing truck. The pump is driven by the engine of the recovery unit. The pump is driven by the engine of the recovery unit. The pump is driven by the engine of the recovery unit. The pump is driven by the engine of the recovery unit. The pump is driven by the engine of the recovery unit", "input_text": " made. Removal normally requires towing the disabled tractor by using a large tow truck specially equipped and dedicated for that purpose. However, such tow trucks are expensive and so in many areas of the country there are either none available or there will usually be considerable delay in obtaining the services of one when a breakdown occurs.\nOver the years, several attempts have been made to temporarily adapt an ordinary truck tractor to perform such towing tasks in addition to its normal use as part of a tractor and trailer rig. The objective is to eliminate the need to locate a dedicated tow truck in order to remove a disabled tractor from the highway to a service facility. The advantage in being able to use another tractor is that such tractors are found with much greater frequency in all parts of the country than are dedicated tow trucks and thus one would surely be quickly and conveniently available about anywhere the need might arise. 
Representative of the hoist and towing mechanisms devised in the prior art for this purpose is the recovery unit shown and described in U.S. Pat. No. 4,708,358 (Gehman). Specialized tow trucks for retrieving disabled truck tractors are limited in turning capability when towing a disabled truck tractor. The recovery unit as shown and described in U.S. Pat. No. 4,708,358 simulates a trailer in that the articulated frame locked into the operative position turns like a truck trailer about the pivotal connection between the recovery unit and the fifth wheel of the towing truck.\nThis recovery unit of U.S. Pat. No. 4,708,358 utilizes at least three hydraulic cylinders coupled to two control valves to effect an unfolding and operative connection to a disabled truck tractor. A major problem in operating the recovery unit constructed as described in U.S. Pat. No. 4,708,358 is the coordination required between the operation of the two control valves operating the respective hydraulic cylinders"} +{"output_text": "x80x9d), and rest (xe2x80x9ccatagenxe2x80x9d). The anagen phase is characterized by rapid cell division and elongation of the hair shaft. The catagen phase is characterized by cell differentiation and the onset of apoptosis. The telogen phase is characterized by the cessation of cell division and the onset of apoptosis.\nThe hair cycle is controlled by a complex interaction of genetic and environmental factors. The hair cycle is divided into three phases", "input_text": " of etching the glass substrate, the level of the etchant is lowered. 
Hence, the etching apparatus according to the related art recognizes that the etchant is normally drained since the nitrogen gas is discharged through the \u201cL(low)\u201d, \u201cH(high)\u201d, and \u201cHH(high high)\u201d nitrogen tubes 24a, 24b, and 24c, in order.\nUnfortunately, the etching apparatus according to the related art has the following problems or disadvantages.\nFirst of all, as the etching process is repeated, sludge as precipitates of the glass etched by the etchant blocks at least one of the \u201cL\u201d, \u201cH\u201d, and \u201cHH\u201d nitrogen tubes of the etchant detect sensor to perform the supply and discharge of the etchant abnormally. Hence, the etching apparatus according to the related art can cause failure of the etching process.\nSecondly, when the \u201cL\u201d nitrogen tube is blocked, as shown at 26 in FIG. 2, a cleaning process for cleaning the glass substrate with deionized water is carried out under the circumstance that the drain of the etchant is not completed. Hence, the etching apparatus according to the related art causes degradation of the cleaning work since the deionized water as the cleaning material is mingled with the etchant.\nFinally, the etching apparatus according to the related art consumes cost and time for replacement of the components or periodical cleaning works to prevent outlets of the \u201cL\u201d, \u201cH\u201d, and \u201cHH\u201d nitrogen tubes from being blocked by the sludge, thereby reducing productivity. The present invention relates to methods and pharmaceutical compositions for An treating and preventing alopecia in a a patient in need thereof.\nHair growth is not continuous, but comprises alternating periods of growth (xe2x80x9canagenxe2x80x9d), regression (xe2x80x9ccategenxe2"} +{"output_text": "-aminobenzoic acid (PABA), an essential precursor for the synthesis of the amino acids lysine and arginine. 
CVD908 also contains a deletion mutation in the purA gene, which encodes a phosphoribosylaminoimidazole carboxylase (AICAR transformylase) required for the synthesis of purine nucleotides. Purine nucleotides are required for the synthesis of DNA and RNA.\n1.2.3 Attenuated Salmonella typhi as a Live Vector Strain for Use in", "input_text": ", Vibrio cholerae), commensals (e.g., Lactobacillus, Streptococcus gordonii) and licensed vaccine strains (e.g., BCG). S. typhi is a particularly attractive strain for human vaccination.\n1.2.2 Attenuated Salmonella typhi as a Live Vector Strain\nS. typhi is a well-tolerated live vector that can deliver multiple unrelated immunogenic antigens to the human immune system. S. typhi live vectors have been shown to elicit antibodies and a cellular immune response to an expressed antigen. Examples of antigens successfully delivered by S. typhi include the non-toxigenic yet highly immunogenic fragment C of tetanus toxin and the malaria circumsporozoite protein from Plasmodium falciparum. \nS. typhi is characterized by enteric routes of infection, a quality which permits oral vaccine delivery. S. typhi also infects monocytes and macrophages and can therefore target antigens to professional APCs.\nExpression of an antigen by S. typhi generally requires incorporation of a recombinant plasmid encoding the antigen. Consequently, plasmid stability is a key factor in the development of high quality attenuated S. typhi vaccines with the ability to consistently express foreign antigens.\nAttenuated S. typhi vaccine candidates for use in humans should possess at least two well separated and well defined mutations that independently cause attenuation, since the chance of in vivo reversion of such double mutants would be negligible. The attenuated vaccine candidate S. typhi CVD908 possesses such properties. CVD908 contains two non-reverting deletion mutations within the aroC and aroD genes. 
These two genes encode enzymes critical in the biosynthetic pathway leading to synthesis of chorismate, the key precursor required for synthesis of the aromatic amino acids phenylalanine, tyrosine, and tryptophan. Chorismate is also required for the synthesis of p"} +{"output_text": "\n2. Description of the Related Art\nA hybrid vehicle is a vehicle driven by an engine and a motor. The hybrid vehicle is driven by the engine and the motor, and is capable of traveling by the engine alone or by the engine and the motor.\nThe hybrid vehicle is driven by the engine and the motor, and is capable of traveling by the engine alone or by the engine and the motor.\nThe hybrid vehicle is driven by the engine and the motor, and is capable of traveling", "input_text": "\nFrom equations (1) through (7), we have an equation (8) which expresses the system errors as follows: EQU W.sub.L (x, y)-W.sub.r (x, y)={W.sub.1 (x, y)-W.sub.2 (-x, -y) +W.sub.3 (x, y)+W.sub.3 (-x, -y)}/2 (8)\nwhere W.sub.2 (-x, -y) indicates that the address of the measurement data W.sub.2 obtained in the second measuring operation is rotated about the optical axis by 180 degrees.\nWhen the surface shape other than the spherical surface, e.g., the rotation asymmetric surface shape of the object to be measured, such as a cylindrical lens or a cylindrical mirror, is to be measured by the interferometer shown in FIG. 1, a cylindrical wavefront must be generated using a cylindrical lens in the condenser lens.\nIn that case, the light reflected by the object to be measured in the system indicated by FIG. 2 (C) returns to the condenser lens as the light wavefront which is symmetric with respect to the focusing line.\nTherefore, the equation (7) is transformed to EQU W.sub.C (x, y)=W.sub.i (x, y)+{W.sub.L (x, y)+W.sub.L (x, -y)}/2 (9)\nConsequently, the system errors cannot be expressed by the equation (8). 1. 
Field of the Invention\nThe present invention relates to a control apparatus for a hybrid vehicle driven by an engine and a motor. In particular the invention relates to a control apparatus for a hybrid vehicle, which regulates the deceleration regeneration amount by a motor according to traveling conditions."} +{"output_text": " processed by the image data processing part 132, and the dimension of the pattern is calculated by the pattern dimension measuring system 110.\nIn the pattern dimension measuring system 110, the pattern dimension is calculated by the following equation (1).\n[Equation 1]\nD=L/Sxe2x80x83xe2x80x83(1)\nwhere D is the pattern dimension, L is the length of the pattern, and S is the width of the pattern.\nIn the", "input_text": " the image data processing part 132, to suitably store the calculated results in a memory 14.\nReferring to FIG. 2, an example of a sequence for measuring the dimensions of a pattern, which is formed on the surface of the wafer 5 using the pattern dimension measuring system 110 shown in FIG. 1, will be described below. FIG. 2 is a schematic diagram showing the moving direction of the X-Y stage 3. In this example, the stage 3 is designed to move from a measurement start position Ps to a measurement end position Pe while drawing a locus shown by the dotted line in FIG. 2.\nFirst, by the sample conveyance system 12, the wafer 5 is conveyed into the vacuum sample chamber 2 to be mounted on the upper surface of the X-Y stage 3.\nThen, global alignment marks {circle around (1)} and {circle around (2)}, which are formed on the surface of the wafer 5 at substantially center and peripheral portion thereof, respectively, are used to carry out the global alignment to calculate a correlation between a pattern layout coordinate system and a stage coordinate system on the wafer 5.\nThen, the stage 3 is moved so that the position of a target pattern to be measured, e.g., the vicinity of pattern {circle around (3)} shown in FIG. 
2, is a position irradiated with the electron beam 96, and stopped at this position. Then, the exciting current of the objective lens 103 is controlled so that the edges of the target pattern are within a beam focal depth by the automatic focus. Then, while the stage 3 is moved again in the direction of the dotted line arrow in FIG. 2, the electron beam 96 is scanned on the pattern {circle around (3)} to detect secondary electrons and so forth, which are emitted from the surface of the wafer 5, by unit of the secondary electron detector 31. The detected signal is data"} +{"output_text": " called nodules. The bacteria provide the plant with a source of nitrogen in the form of ammonia, which is then assimilated by the plant.\nThe nitrogen-fixing bacteria are able to live in the soil because they are able to form a symbiotic relationship with the plant. The bacteria are able to live in the soil because they are able to form a symbiotic relationship with the plant. The bacteria are able to live in the soil because they are able to form a symbiotic relationship with the", "input_text": " embedded in the same wavelength as the data payload prior to inputting the data payload at an input one of the nodes to produce an optical signal, each of the headers having a format and protocol and conveying multicast information indicative a local route through the given node for the data payload and the headers, the format and protocol of the data payload being independent of the format and protocol of the headers; (c) a detector for detecting the multicast information at the given one of the nodes to determine two switch control signals with reference to the multicast information as the data payload and the headers propagate through the optical network; (d) an optical splitter for splitting the optical signal into two split optical signals; (e) a selector for selecting two local routes through the given one of the nodes in correspondence to the two switch control signals; (f) an 
optical switch having input ports and output ports wherein one of the split optical signals couples to a first input port and the second of the split optical signals couples to a second input port, and wherein one of the outgoing links couples to a first output port and the second of the outgoing links couples to a second output port; and (g) a switch controller, coupled to the optical switch and responsive to the two switch control signals, for switching the optical switch in response to the multicast information to optically couple the first input port with the first output port and the second input port with the second output port. Nitrogen is an essential element for plant growth. A number of plants such as the legumes (soybean, peas, beans, alfalfa, clover, peanut, black locust, etc.) and various woody angiosperms (alder, casuarina, etc.) are able to provide their nitrogen requirements by forming a symbiotic association with certain soil bacteria. The bacteria live within root (or in some species, stem) structures"} +{"output_text": " inverted suspensions are suited for absorbing large bumps and tend to bottom out on small bumps. Furthermore, inverted suspensions are more efficient than conventional suspensions because the inverted suspension is able to absorb shocks before they reach the hands of the rider.\nIn addition to the above-mentioned advantages, inverted suspensions are also more efficient than conventional suspensions because the inverted suspension is able to absorb shocks before they reach the hands of the rider. Inverted suspensions are also more efficient than conventional suspensions because the inverted suspension is able to absorb", "input_text": " specifically, to a novel front wheel suspension fork for motorcycles or bicycles.\n2. Background Information\nPresently, motorcycle suspensions are of two general types called the conventional type or the inverted type. 
Conventional type suspensions consist of a damping mechanism--for example, a combination spring, rod, and hydraulic assembly--encased in two hollow cylinders that telescope into each other. The two hollow cylinders are of different diameters so that one cylinder telescopes snugly into the other. The primary damping mechanism--the receiving tube and valving assembly--is placed in the cylinder with the larger diameter and is located at the bottom of the suspension so that shocks transferred from the wheel to the suspension can be immediately damped before traveling to the hands of the rider. Any remaining compression force not absorbed by the damping mechanism are transmitted through the rod to the hands of the rider. In actual practice, conventional suspensions are suited for absorbing small bumps and tend to bottom out on large bumps. Furthermore, these conventional suspensions tend to deform under large stresses, causing the entire suspension to flex, thus decreasing its efficiency to absorb shocks. Inverted suspensions also consist of a damping mechanism encased into two hollow cylinders. As with conventional suspensions, the primary damping mechanism is located in the lower cylinder. However, unlike the conventional suspension, this lower cylinder is the cylinder with the smaller diameter. In fact, an observer can easily distinguish a conventional suspension from an inverted suspension because, in a conventional suspension, the cylinder having the larger diameter is attached to the wheel axis while in an inverted suspension, the cylinder having the smaller diameter is attached to the wheel axis. This configuration-one with the cylinder having a smaller diameter at the bottom and the cylinder having a larger diameter at the top causes the suspension to be rigid and stiff due to the increased length and increased stiffness of the upper cylinder. This increased rigidity solves the deformation problem encountered in conventional forks. 
In practice,"} +{"output_text": "\u03bb\u2003\u2003(1)\nwhere \u03b8 is an angle of incidence, and \u03bb is a wavelength of the EUV light.\nThe multilayer mirror is formed by alternately stacking two kinds of materials having different optical constants. The multilayer mirror is formed by alternately stacking two kinds of materials having different refractive indices. The multilayer mirror is formed by alternately stacking two kinds of materials having different optical constants. The multilayer mirror is formed by alternately stacking two kinds of materials having different", "input_text": " or dioptric system that utilizes refractions of the light used for the visual light and the UV light is no longer practical. Accordingly, the EUV exposure apparatus uses a reflection type optical element (catoptric system) that utilizes reflections of the light. The reflection type optical element in the EUV exposure apparatus includes a grazing-incidence total-reflection mirror and a multilayer mirror.\nIn the EUV region, a real part of the refractive index is slightly smaller than 1, and generates total reflection if the incident angle is made so large that the EUV light is incident close to the reflection surface. A grazing-incidence total-reflection mirror can usually maintain its reflectance of 80% or higher for obliquely incident light within scores of degrees from the reflection surface. Since the grazing-incidence total-reflection mirror has a small design freedom and leads to a large optical system, use of the mirror is impracticable.\nAccordingly, the reflection type optical element in the EUV exposure apparatus uses a multilayer mirror that alternately layers two kinds of materials, such as molybdenum (Mo) and silicon (Si), having different optical constants. 
A sum of two layer thicknesses of two kinds of materials is generally referred to as a layer thickness.\nThe multilayer mirror can be used at an incident angle close to the normal incidence and maintains a high reflectance. The multilayer mirror reflects, when receiving the EUV light, the EUV light with a specific wavelength, exhibiting the wavelength selectivity. For example, where is an incident angle, is a wavelength of the EUV light, d is a layer thickness, and m is an order, the efficiently reflected EUV light is the one having a narrow bandwidth around the wavelength as the center which approximately satisfies the Bragg's equation as Equation 1 below:2 d\u00d7cos \u03b8=m\u00d7"} +{"output_text": " is caused by the fact that the angle sensor is not calibrated correctly. The second error-causing component is a change in the angle sensor's output signal caused by the fact that the angle sensor is not calibrated correctly.\nThe first error-causing component can be eliminated by calibrating the angle sensor. The second error-causing component can be eliminated by calibrating the angle sensor. However, the calibration of the angle sensor is a time-consuming process.\nThe object of the invention", "input_text": " in U.S. Pat. No. 5,011,780. This approach yields hatch rates of greater than 25%; however it is also labor intensive.\nIt would be desirable to provide an improved method for increasing the hatchability of eggs subjected to manipulation. The invention relates to a method for determining a rotor position angle of a synchronous machine. A synchronous machine generally consists of a stator provided with three-phase winding and a magnetised rotor. The rotor is typically magnetised either by means of permanent excitation or separate excitation. In permanent excitation the rotor is provided with permanent magnet blocks, which the magnetic field produced in the stator pulls towards itself, thereby rotating the rotor. 
Separate excitation of the rotor means that the rotor contains coils of wire to which current is supplied. The coils of wire thus form magnetic poles in the rotor, the poles functioning according to the same principle as poles made of permanent magnets. In addition, the rotor of the synchronous machine may be a salient-pole rotor or a cylindrical rotor. In cylindrical rotor machines the rotor inductance remains almost constant with respect to the stator, whereas in salient-pole machines, the rotor inductance varies greatly due to changes in the air gap between the rotor and the stator, depending on the rotor position angle.\nIn speed-controlled synchronous machines, it is important for the functioning of the control system that the position angle of the machine's rotor is known as precisely as possible. Particularly in control methods based on direct control of the machine's stator flux the accuracy of angle determination has a great influence on the accuracy of the control. The rotor position angle is usually determined using a pulse encoder or an absolute sensor the information supplied by which allows the rotor angle to be determined.\nThe measurement result obtained from the angle sensor contains errors caused at least by two different components that can be determined. The first known error-causing component is an incorrect initial angle, which"} +{"output_text": " is reduced.\nFurther, JP-A-2002-237720 discloses a technology of achieving excellent communication by using 4 loops of a transmitting antenna and minimizing a nondetecting region produced at a portion of intersecting loop antennas. 
However, when a receiving antenna can be a related-art rectangular shape is provided to increase a receiving function of the antenna apparatus, there is a case in which a transmitting function is reduced.\nFurther, JP-A-2002-237720 discloses a technology", "input_text": " of a transmitting antenna 91 and on an inner side of the transmitting antenna 91, respective magnetic fluxes generated by the transmitting antenna 91 are made to pass a first through a third magnetic flux passing region S1, S2, S3 on the inner side of the receiving antenna 92 in correspondence with the first through the third loop antennas 91a through 91c. At this occasion, a direction of magnetic fluxes in the first and the third magnetic flux passing regions S1, S3 is inverse to a direction of magnetic fluxes of the second magnetic flux passing region S2.\nWhen the magnetic fluxes of the respective magnetic flux regions S1, S2, S3 are respectively designated by notations \u03c61, \u03c62, \u03c63, a total \u03c6 of the fluxes passing the first through the third magnetic flux passing regions S1, S2, S3 of the receiving antenna 92 becomes \u03c61\u2212\u03c62+\u03c63.\nNormally, a relationship of a degree of canceling when the magnetic fluxes passing the magnetic flux passing regions of the receiving antenna 92 are canceled by each other is not \u03c61+\u03c63=\u03c62. Therefore, the total \u03c6 of the magnetic fluxes is not nullified. Therefore, a current is induced in the receiving antenna 92, a current flowing in the transmitting antenna 91 is consumed to reduce by the receiving antenna 92 and thus a transmitting function is reduced. 
When such an inappropriate coupling cannot completely be canceled, a communicating function is reduced and a region of detecting the RF tag is narrowed.\nFurther, JP-A-2002-237720 discloses a technology of achieving excellent communication by using 4 loops of a transmitting antenna and minimizing a nondetecting region produced at a portion of intersecting loop antennas. However, when a receiving antenna can be a related-art rectangular shape is provided to increase a receiving function of the antenna apparatus, there is a case in which a transmitting function"} +{"output_text": " by overlapping the cross-sectional images.\nHowever, the manual input of the trajectory of the dental arch in the cross-sectional image is very difficult. In addition, the manual input of the trajectory of the dental arch in the cross-sectional image is not accurate. Thus, the panoramic image is not accurately reconstructed.\nIn addition, when a panoramic image is reconstructed by overlapping the cross-sectional images, the panoramic image is not accurately reconstructed.\nIn addition, when a panor", "input_text": " medical images corresponding to the selected at least one cross-section and the at least one cross-section adjacent to the selected cross-section. However, this approach cannot acquire a 2D medical image (i.e. 
a panoramic image or a cephalometric image in the case of a 2D dental image) by reconstructing image information in a specific area of paths along which X-rays are emitted (hereinafter, referred to as X-ray emission paths) from the sagittal or coronal cross-section.\nSince 2D medical images corresponding to different cross-sections, which are spaced apart at least predetermined distances, are magnified at different degrees due to the fact that ultrasonic beams or X-ray beams tend to propagate radially, when a 2D medical image is acquired by overlapping medical images in a selection area (range) using a simple image synthesizing method used in the related art, the acquired 2D medical image becomes inaccurate.\nWhen a CT operator is not experienced, the CT operator frequently takes CT images without accurately aligning with the position of a patient (subject). When CT images are taken with the position of the patient being misaligned, it is impossible to prevent horizontal asymmetry or distortion in the facial skeleton, unless CT geometry correction is performed. Thus, when a CT image acquired from a patient without inaccurately aligning the position of the patient is displayed through a medical diagnostic 3D viewer, the initial CT image appears distorted. In addition, when features are extracted from CT volume data or a panoramic image or a cephalometric image is automatically reconstructed without executing a CT geometry correction algorithm, the performance of a reconstruction algorithm is lowered, which is problematic.\nIn the related art of a dental computed tomography, a panoramic image is reconstructed manually. A user manually inputs the trajectory of the dental arch in a cross-sectional image -of the axial direction. And the panoramic image is generated"} +{"output_text": " documents are considered material to the patentability of the claims of the present application. 
All statements as to the date or representations as to the contents of these documents are based on the information available to the applicant and does not constitute any admission as to the correctness of the dates or contents of these documents.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer wiring structure.\nIn recent years, the degree of integration", "input_text": " which have been cultured in an electromagnetic field to produce a product that has an anti-cancer effect. The following are some examples of prior uses of yeast cells and components thereof:\nU.S. Pat. No. 6,197,295 discloses a selenium-enriched dried yeast product which can be used as dietary supplement. The yeast strain Saccharomyces boulardii sequela PY 31 (ATCC 74366) is cultured in the presence of selenium salts and contains 300 to about 6,000 ppm intracellular selenium. Methods for reducing tumor cell growth by administration of the selenium yeast product in combination with chemotherapeutic agents is also disclosed.\nU.S. Pat. No. 6,143,731 discloses a dietary additive containing whole \u03b2-glucans derived from yeast, which when administered to animals and humans, provide a source of fiber in the diet, a fecal bulking agent, a source of short chain fatty acids, reduce cholesterol and LDL, and raises HDL levels.\nU.S. Pat. No. 5,504,079 discloses a method of stimulating an immune response in a subject utilizing modified yeast glucans which have enhanced immunobiologic activity. The modified glucans are prepared from the cell wall of Saccharomyces yeasts, and can be administered in a variety of routes including, for example, the oral, intravenous, subcutaneous, topical, and intranasal route.\nU.S. Pat. No. 4,348,483 discloses a process for preparing a chromium yeast product which has a high intracellular chromium content. 
The process comprises allowing the yeast cells to absorb chromium under a controlled acidic pH and, thereafter inducing the yeast cells to grow by adding nutrients. The yeast cells are dried and used as a dietary supplement.\nCitation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that the cited"} +{"output_text": "ol is selected from the group consisting of acetone, ethyl acetate, ethyl acetate-hexane, ethyl acetate-pet.ether, ethyl acetate-hexane or any mixture thereof. The triphenyl phosphine halide used in step (g) is selected from the group consisting of triphenyl phosphine chloride, triphenyl phosphine bromide, triphenyl phosphine iodide and triphenyl phosphine sulphide.\nThe isolated (xe2x88x92) 3,4-divanilly", "input_text": ") concentrating the organic solvent to a residue and crystallizing it from a suitable organic solvent or mixtures of such solvents to get (xe2x88x92) secoisolariciresinol of formula (1).\nThe isolated (xe2x88x92) secoisolariciresinol of formula (1) is converted into (xe2x88x92) 3,4-divanillyl tetrahydrofuran of formula (2) by (f) dissolving the isolated (xe2x88x92) secoisolariciresinol in an organic solvent, reacting it with triphenyl phosphine halide at 0-80xc2x0 C. for 1-10 hours and (g) isolating (xe2x88x92) 3,4-divanillyl tetrahydrofuran of formula (2) by column chromatography.\nThe alcohol used in step (a) can be an alkanol such as methanol and ethanol. The chlorinated solvent in step (b) is selected from the group consisting of chloroform and dichloromethane. The base used in step (c) is selected from the group consisting of sodium hydroxide, potassium hydroxide and lithium hydroxide.\nThe organic solvent used in step (c) is selected from the group consisting of toluene, chloroform, dichloromethane, ethyl acetate. The mineral acid in step (d) is selected from the group consisting of hydrochloric acid, sulphuric acid. 
The suitable organic solvent or mixtures of such solvents used in step (e) comprises acetone, and mixtures of acetone-petroleum ether acetone-hexane, ethyl acetate-pet.ether, ethylacetate-hexane or any mixture thereof.\nThe organic solvent used in step (f) for dissolving (xe2x88x92) secoisolariciresin"} +{"output_text": ".\nPreferably the apparatus comprises a plurality of support points for defining the position of the mask perpendicular to its plane, that is in the x, y and Rx directions. The, or each, member defines the x, y and Rz position of the mask, and support points define the remaining position of the mask without distorting it.\nPreferably the apparatus comprises a plurality of support points for defining the position of the mask perpendicular to its plane, that is in the x, y and", "input_text": " for holding a substrate;\na projection system for imaging irradiated portions of the mask onto target portions of the substrate; wherein:\nsaid mask table comprises at least one compliant member for holding said mask such that said at least one compliant member yields to conform substantially to the profile of the mask.\nThe use of at least one compliant member enables the mask to be held, but without unwanted deformation by forcing it to adopt a particular shape. The member can yield to accommodate flatness variations in the mask. The member, preferably a membrane, has a stiffness in the xy plane such that thermal expansion of the mask can be accommodated by the flexibility of the member, but without slipping of the mask with respect to the member. Slipping is detrimental to overlay precision, more so than thermal expansion, because of its asymmetric occurrence.\nBy appropriate choice of material and thickness of the member, its stiffness can be determined such that any particles trapped between the mask and member will preferentially deform the member rather than the mask. 
This can reduce deflection of the mask caused by a contaminant particle by a factor of as much as 10,000 compared to conventional mask clamping arrangements.\nPreferably said at least one member comprises a pair of parallel strips, each of which is supported along its length. This improves the stiffness of the member against sagging, and reduces material creep.\nPreferably the apparatus comprises a recess in the member which can function as a vacuum space for holding the mask and the member against each other (see for example FIG. 4). This arrangement is both secure and compact.\nPreferably the apparatus comprises a plurality of support points for defining the position of the mask perpendicular to its plane, that is in the z, Rx and Ry directions. The, or each, member defines the x, y and Rz position of the mask, and support points define the remaining position of the mask without distorting it"} +{"output_text": ". The balancing of the circuit is difficult to achieve and requires a high degree of precision.\nThe present invention is directed to a method and apparatus for providing a digital loop carrier system which is compatible with existing digital loop carrier systems.\nThe present invention is also directed to a method and apparatus for providing a digital loop carrier system which is compatible with existing digital loop carrier systems.\nThe present invention is further directed to a method and apparatus for providing a digital loop carrier system which is compatible with existing digital", "input_text": "A block level representation of this test system architecture is shown in FIG. 1C. As shown therein, the MCU unit is coupled to the PGTC by the Channel Test Unit (CTU) which gives Tip and Ring connections to the metallic channel unit. A corresponding MCU is provided at the remote end of the digital loop carrier and the Tip and Ring connections at the remote end of the system are emulated at the central office end. 
Hence, the emulated wire pair connection causes a load appearing at the remote end of the system to be connected via a wire pair to the central office end of the system and vice versa.\nOne disadvantage of the metallic channel unit is that it is specific to the particular digital loop carrier which is being utilized. Each type of DLC has its own proprietary interface for the metallic channel unit. This requires the telephone company to maintain an inventory of multiple types of MCU's. For example, in the MCU 4496 product information sheet (Issue 3B, list 1), the version of the MCU for the AT&T SLC96 DLC is described. In each case, the MCU unit must be integrated into the COT as well as the RT, as illustrated in FIG. 1C. Another disadvantage is that the MCU requires two DS0 digital channels to link the CO end with the remote end.\nAnother difficulty with the MCU is in the metallic emulation function. FIG. 2 is a reproduction of FIG. 2 of U.S. Pat. No. 5,457,743 (the '743 patent) which shows an equivalent circuit of the MCU. As shown in that patent, the two ends of the tip line connection are essentially identical in their implementation. The practical difficulty in implementing this system is in balancing the opposite sides of the circuit to make the system appear as a piece of cable with very high DC impedance"} +{"output_text": "ane is oxidized in a pressured reactor in the presence of a solubilized molybdenum catalyst to provide a mixture of tertiary butyl alcohol, tertiary butyl hydroperoxide, methanol, acetone, and other oxygen-containing compounds. The tertiary butyl hydroperoxide is thermally decomposed under pressure at about 280.degree. F. to provide a tertiary butyl alcohol product containing only residual quantities of tertiary butyl hydroperoxide which are then decomposed in accordance with Grane U. S", "input_text": ".\nIn Massie U. S. Pat. No. 
3,775,472 a process is disclosed wherein alkyl substituted aromatic hydrocarbons are oxidized to products such as aromatic alcohols, aldehydes and carboxylic acids in the presence of ruthenium compounds.\nGrane U.S. Pat. No. 3,474,151 discloses that tertiary butyl alcohol starts to dehydrate at 450.degree. F. and to decompose at a \"rapid rate\" at temperatures above 475.degree. F. Grane discovered, however, that residual quantities of hydroperoxide contaminants present in tertiary butyl alcohol could be thermally decomposed by heating the contaminated tertiary butyl alcohol at a temperature of 375.degree. to 475.degree. F. for about 1 to 10 minutes.\nGrane et al. U. S. Pat. No. 4,294,999 discloses a process wherein isobutane is oxidized in a pressured reactor in the presence of a solubilized molybdenum catalyst to provide a mixture of tertiary butyl alcohol, tertiary butyl hydroperoxide, methanol, acetone, and other oxygen-containing compounds. The tertiary butyl hydroperoxide is thermally decomposed under pressure at about 280.degree. F. to provide a tertiary butyl alcohol product containing only residual quantities of tertiary butyl hydroperoxide which are then decomposed in accordance with Grane U. S. Pat. No. 3,474,151 by heating the tertiary butyl alcohol at 375.degree. to 475.degree. for about 1 to 10 minutes. Heating tertiary butyl alcohol containing small amounts of peroxides at high temperatures for even short periods of time to remove the peroxides produces undesirable products such as isobutylene.\nGrane et al. U. S. Pat. No. 4,296,262 discloses a related process wherein isobut"} +{"output_text": "\nThe invention relates to a method for the production of a semiconductor device, and more particularly to a method for the production of a semiconductor device having a multilayer wiring structure.\nIn recent years, the degree of integration of semiconductor devices has been increased, and the number of layers of wiring has been increased. 
In order to increase the number of layers of wiring, it is necessary to reduce the thickness of the interlayer insulating film. However, when the thickness of the interlayer insulating film is reduced", "input_text": " at a defined rate, a token counter for each bucket provided for each buffer or queue, and computational logic for deducting tokens from the counter of each bucket according to the size of each transferred packet, etc.\nHowever, in such a system, smaller token units require faster computational logic for deducting tokens from the counter of each bucket. Further, when there are a plurality of queues, a mechanism for supplying tokens at a rate based on each guaranteed minimum bandwidth and a mechanism for counting tokens for each queue are required. Such a system would be complicated and large in size.\nFor each user, relay priority is a matter of relative preference. Accordingly, in order to use a guaranteed bandwidth efficiently, it is desirable that packets with low relay priority should be transferred up to the guaranteed minimum bandwidth unless there is another packet having higher relay priority at the same time. Furthermore, packets having higher relay priority should be preferentially allocated as traffic within the guaranteed minimum bandwidth without being affected by another packet having lower relay priority. If packets having high relay priority are received in excess of the guaranteed minimum bandwidth, the high relay priority packets should be treated as a best-effort traffic and may be marked for preferential discard, even though the relay priority is high. Determination of the mechanical behavior physical properties of materials is necessary so that materials may be selected for use, evaluated when in use, and evaluated after use. From these determinations, decisions are made as to which materials to use, the conditions under which they can be used, and whether such materials in use can be continued to be used with safety. 
These types of determinations are particularly useful for determining the effects of environmental loading such as nuclear radiation on the mechanical properties of in-service materials. This invention is fully applicable to the determination of mechanical behavior of such materials but is also applicable to materials not subjected to radiation and the validity of the invention was demonstrated for materials not subjected to radiation."} +{"output_text": " the toner is not easily removable from the paper.\nInk jet printing is also not a viable alternative for high speed variable printing at present, because the ink coverage and water saturation is increased. This is due to the four color process that is used to generate color images. Four color processing involves laying cyan, magenta, yellow and black (i.e., CMYK) ink in varying amounts to make any color on the page. Thus, some portions of the page may have as many", "input_text": " to create a bubble. The expanding bubble causes a droplet to form, and the droplet is ejected from the print head. Piezoelectric technology uses a piezo crystal located at the back of each ink reservoir. Electric charges are used to cause vibrations in the crystals. The back and forth motion of the crystal is able to draw in enough ink for one droplet and eject that ink onto the paper.\nThe quality of color ink jet printing is generally orders of magnitude lower than that of offset lithography and gravure. Furthermore, the speed of the fastest ink jet printer is typically much slower than a lithographic or gravure press. Traditional ink jet printing is also plagued by the effect of placing a water-based ink on paper. Using a water-based ink may saturate the paper and may lead to wrinkling and cockling of the print web. In order to control these phenomena, ink jet printers use certain specialized papers or coatings. 
These papers can often be much more expensive than a traditional web.\nFurthermore, when ink jet technology is used for color printing, the ink coverage and water saturation is increased. This is due to the four color process that is used to generate color images. Four color processing involves laying cyan, magenta, yellow and black (i.e., CMYK) ink in varying amounts to make any color on the page. Thus, some portions of the page may have as many as four layers of ink if all four colors are necessary to produce the desired color. Additionally, the dots produced by an ink jet printer may spread and produce a fuzzy image.\nLaser printing does not appear to be a viable alternative for high speed variable printing at present, because production speeds are still much slower than offset and gravure, and the material costs (e.g., toner, etc.) are extremely high. Laser color is also difficult to use for magazines and other bound publications, because"} +{"output_text": " stopping is undesirable, since it leads to a lower yield of properly functioning devices.\nThe present invention provides a method and apparatus for improving the uniformity of the plasma etch process by providing a more uniform etchant ion/radical distribution over the wafer surface. The invention is particularly useful for improving the uniformity of the plasma etch process in a plasma reactor having a ceiling with a dielectric material, such as quartz, and a coil overlying the ceiling and facing the wafer being processed.\nThe invention is based", "input_text": " gases are fed through the ceiling directly over the workpiece.\n2. Background Art\nThe inductively coupled plasma reactor disclosed in U.S. Pat. No. 4,948,458 has a planar coil overlying the chamber ceiling and facing the semiconductor wafer being processed, thereby providing an optimally uniform RF induction field over the surface of the wafer. 
For this purpose, the ceiling, which seals the reactor chamber so that it can be evacuated, must be fairly transmissive to the RF induction field from the coil and is therefore a dielectric, such as quartz. It should be noted here that such a ceiling could be made from dielectric materials other than quartz, such as aluminum oxide. However other materials such as aluminum oxide tend produce greater contamination than quartz due to sputtering.\nPolymerization during a plasma etch process requires a careful balance of etchant and polymer, the etchant concentration typically being at a depletion level to avoid inhibition of appropriate polymer formation. As a result, a significant proportion of etchant ions and radicals formed near the wafer periphery are consumed before reaching the wafer center, further depleting the etch ion concentration over the wafer center. This leads to a lower etch rate or etch stopping near the wafer center.\nOne reason that there are more ions at the wafer periphery is that introduction of the etchant precursor gas from the side can produce a non-uniform etchant ion/radical distribution favoring the side. Many of the etchant ion/radical-forming energetic electrons generated near the side are lost to collisions with other species before reaching the wafer center, thus reducing the etchant ion concentration at the wafer center. The relative lack of etchant ions near the wafer center permits faster formation of polymer at the wafer center, so much so that in some cases the polymer formation overwhelms the etch process and stops it, particularly at feature sizes less than 0.5 microns. Such etch"} +{"output_text": " and a base station controller (BSC) will be an IP based wireless link. The BTS and BSC will be connected by a wireless link. The BTS and BSC will be connected by a wired link. The BTS and BSC will be connected by a wired link. The BTS and BSC will be connected by a wired link. The BTS and BSC will be connected by a wired link. 
The BTS and BSC will be connected by a wired link", "input_text": " device from the transmission document, wherein the general purpose document includes a plurality of document elements, and the piece of device specification information corresponding to the specification of received document processing device includes document element selection information on each document element necessary for the specification of received document processing device.\nSuch an information providing server device may convert a transmission document that the transmission document edition device has edited into another transmission document suitable for a mobile communication terminal of one specification. As a result, the load of the specification of mobile communication terminal may be reduced.\nThe present invention may be realized by a computer-readable storage medium for storing a program that realizes a function of editing a transmission document that is to be transmitted to received document processing devices of a plurality of specifications from a general purpose document described in a markup language, wherein the program realizes: a device specification information obtaining unit for obtaining a plurality of pieces of device specification information each piece of which a received document processing device of a different specification refers to when processing the general purpose document according to marks in the markup language; and a transmission document creation unit for creating the transmission document in which the general purpose document that is described in the markup language and the plurality of pieces of device specification information that the device information obtaining unit has obtained are related to each other.\nAs a result, a transmission document edition device that edits a transmission document to be transmitted to the received document processing devices of the plurality of specifications from a general purpose document described in a markup 
language and a plurality of pieces of device specification information for the plurality of specifications of mobile communication terminal may be realized. The invention relates generally to Internet Protocol (IP) packet communication methods and apparatus and more particularly to point to point protocol (PPP) packet communication methods and apparatus.\nInternet Protocol (IP) based wireless communication architectures are known. As shown in FIG. 1, in the future the link between a base transceiver station (BTS)"} +{"output_text": " can be drawn in a short time. However, since the circle drawing method disclosed in Japanese Unexamined Patent Publication No. Hei 4-52776 is a method of drawing a circle by sequentially determining the coordinates of the points on the circumference of the circle, the circle drawing method disclosed in Japanese Unexamined Patent Publication No. Hei 4-52776 is not suitable for drawing a circle having a complicated shape.\nIn order to draw a circle having a complicated shape, a method of", "input_text": ".4) is not positive, the Y coordinate is increased by one and the X coordinate is decreased by one from the point 203.\nSimilarly, f(Z.sub.5) is derived. Depending upon the sign of f(Z.sub.5), the coordinate of the next point 205 is determined. In this case, as shown in FIG. 2E, since the sign of f(Z.sub.5) is positive, the Y coordinate is increased by one from the point 204, and the X coordinate is held unchanged. Then, f(Z.sub.6) is derived, and the coordinate of the next point 6 is determined in response to the sign of f(Z.sub.6). In this case, as shown in FIG. 2F, since the sign of f(Z.sub.6) is not positive, the Y coordinate is increased by one and the X coordinate is decreased by one from the point 205. Furthermore, f(Z.sub.7) is derived, and the coordinate of the next point 7 is determined in response to the sign of f(Z.sub.7). In this case, as shown in FIG. 
2G, since the sign of f (Z.sub.7) is not positive, the Y coordinate is increased by one and the X coordinate is decreased by one from the point 206. Then, since X=Y is established at the coordinate of the point 7, calculation of the coordinates of the one-eighth fraction of the circle is completed. It should be noted that, while not illustrated, points symmetric to respective points 201 to 207 with respect to the (Y=X) axis, the X-axis and/or the Y-axis are derived sequentially.\nBy the circle drawing method disclosed in Japanese Unexamined Patent Publication No. Hei 4-52776, a full circle"} +{"output_text": "2-amyloid peptide (A xcex2-peptide) which is a major component of the neuritic plaques found in AD (16).\nThe xcex2-peptide is a major component of the neuritic plaques found in AD and is derived from the xcex2-secretase cleavage of APP (17). The xcex2-secretase cleavage of APP is a complex process that involves at least two enzymes, xcex2-secretase and xcex3-secretase. The xcex2-secretase", "input_text": " animal model but also reduce the production of xcex2-amyloid which is one of the principal constituents of neuritic plaques (10).\nFrom a neuropathology perspective, deposition of amyloid and formation of NP is one of the central mechanisms in the evolution of AD (11, 12). However, amyloid plaques are also found in brains of elderly individuals who do not have dementia (13). It has been suggested that the amyloid plaques in individuals without dementia are xe2x80x9cbenignxe2x80x9d and they become xe2x80x9cmalignantxe2x80x9d, causing dementia, when they are transformed into plaques-containing degenerated neurites (13). These plaques are called neuritic plaques (NP). The mechanism of transformation from xe2x80x9cbenignxe2x80x9d to xe2x80x9cmalignantxe2x80x9d plaques is as yet unknown. 
It has been suggested that BuChE may play a major role in this-transformation based on the observation that BuChE is found predominately in plaques that contain dystrophic neurites and not in plaques without dystrophic neurites (13).\nTaken together these observations suggest that in brains of patients with AD there is a significant alteration of the biochemical properties of BuChE that alters its normal regulatory role in the brain thus contributing to the pathology of AD.\nRecently, a brain specific serine protease called trypsin IV has been isolated and it is presumed to be involved in APP processing (24). Amyloid precursor protein (APP) is a transmembrane glycoprotein, which possesses a Kunitz-type serine protease inhibitor domain. The APP may be involved in protease regulation in the brain (14, 15). Of particular importance is the fact that abnormally cleaved APP results in the formation of a 40-42 amino acid residue xcex"} +{"output_text": " is manufactured by rapid cooling of molten liquid metal, and thus has a high saturation flux density and a small core loss. 
However, the amorphous metal has a low permeability, and thus is not suitable for use in a high frequency large magnetic core.\nIn order to solve the above-mentioned problems, a method of manufacturing a magnetic core using a magnetic powder having a high permeability and a high saturation flux density has been proposed.\nThe magnetic powder is manufactured by a powder metallurgy method, and", "input_text": " components such as high-capacity inverters, coil parts of power sources, and distribution transformers.\nMeanwhile, the amorphous metal has a constituent atom having a disordered structure similar to the liquid state, and is manufactured by rapid cooling of molten liquid metal to thus represent various characteristics different from the existing crystalline materials, in particular, show excellent soft magnetic properties.\nThe amorphous metal is largely classified into iron (Fe)-based metal, cobalt (Co)-based metal, etc., depending on the main ingredient thereof. The Fe-based amorphous metal has a high saturation flux density and a small core loss when compared to those of the silicon steel sheet. Accordingly, the Fe-based amorphous metal is used in a large capacity pole transformer or in a high frequency large magnetic core. The Co-based amorphous metal has a high permeability, and a core loss and coercivity, and thus is used as a high frequency small magnetic core.\nMoreover, the amorphous metal has a small core loss and a small eddy current loss when compared to other soft magnetic materials, and thus has been highlighted as the soft magnetic material for magnetic cores on behalf of silicon steel sheets or ferrite. 
The amorphous metal is excellent in view of high-efficiency, high frequency characteristics due to eddy current losses such as large electrical specific resistivity, noise suppression characteristics by high permeability and high saturation flux density, DC bias characteristics, and responsiveness required for miniaturization.\nProducts with low core loss characteristics are choke cores, high-frequency transformers for use in inverters, distribution transformers, various reactors, etc. Products using high permeability characteristics are pulse transformers, step-up transformers, audio transformers, current transformers, noise filters, etc. In this case, magnetic cores are classified into a relatively small-capacity gap type toroidal shape core and a relatively large-capacity rectangular shape cut core.\nThe amorphous metal"} +{"output_text": " step size of the quantizer 58 so that the transmit buffer 68 never overflows.\nThe rate controller 80 also includes a buffer controller 82 to insure that the encoded data stream is transmitted at a fixed rate. The buffer controller 82 monitors the transmit buffer 68 to insure that the buffer 68 does not underflow. If the buffer 68 underflows, the buffer controller 82 increases the rate at which the encoded data stream is transmitted. Conversely, if the buffer 68 over flows, the buffer controller 82 decreases", "input_text": " these Y values from the pre-compression Y values of the matching macro block of the non-I frame. These differences, which are called residuals, are arranged in 8xc3x978 blocks and are processed by the DCT 56, the quantizer 58, the coder 66, and the buffer 68 in a manner similar to that discussed above, except that the quantized DC coefficients of the residual blocks are coupled directly to the coder 66 via the line 60, and thus are not predictively encoded by the prediction encoder 44.\nAdditionally, it is possible to use a non-I frame as a reference frame. 
When a non-I frame will used as a reference frame, the quantized residuals from the quantizer 58 are respectively dequantized and inverse transformed by the dequantizer 70 and the inverse DCT 72 so that this non-I reference frame will be the same as the one used by the decoder for the reasons discussed above. The motion predictor 78 provides to the summer 74 the decoded Y values of the I reference frame from which the residuals were generated. The summer 74 adds the respective residuals from the circuit 72 to these decoded Y values of the I reference frame to generate the respective Y values of the non-I reference frame. The reference frame buffer 76 then stores the non-I reference frame along with the I reference frame for use in encoding subsequent non-I frames.\nStill referring to FIG. 4, the encoder 50 also includes a rate controller 80 to insure that the transmit buffer 68, which typically transmits the encoded frame data at a fixed rate, never overflows or empties, i.e., underflows. If either of these conditions occurs, errors may be introduced into the encoded data stream. For example, if the buffer 68 overflows, data from the coder 66 is lost. Thus, the rate controller 80 uses feed back to adjust the quantization"} +{"output_text": " glands are the common glands present in the skin surface that produce a clear, watery secretion. Eccrine glands are found in the palms, soles, axillae, and genitalia. Eccrine glands are composed of a secretory portion and a ductal portion. The secretory portion is composed of a single layer of cells that produce the secretion by a merocrine mechanism. The ductal portion is composed of a single layer of cells that produce the secretion by a merocrine mechanism. The", "input_text": " from the gland and ducts by contraction of surrounding muscle-like myoepithelial cells. In some secretory glands, such as the mammary gland, increased expulsion has a feed back effect in stimulating further secretory production. 
In addition, the number and amount of secretory and myoepithelial cells can be modulated, with proportional changes in the amount of secretion produced. Finally the act of secretion is often accompanied by vascular dilation around the gland, which is believed to aid the gland by increased delivery of nutrients.\nSkin secretory cells produce their secretion by 3 basic mechanisms.\nApocrine glands are the common sweat glands present throughout the skin surface that produce profuse watery secretion. Apocrine glands have a simple organization; the gland is composed of a coiled duct in the dermis with an open end that discharges onto the skin surface. They produce a watery secretion that evaporates and cools the skin thereby playing a role in thermoregulation. Discharge of the secretion from the lumen of the ductal portion of the apocrine sweat gland is assisted by the action of myoepithelial cells which surround the secretory portion of the gland.\nThe cells lining the ducts of apocrine glands produce the secretion by a merocrine mechanism. This terminology is confusing as it would appear that this sweat glands should be called merocrine glands. However, the sweat glands were named before the exact mechanism of their cellular secretion was known, and their original names have persisted.\nExcessive sweating, formally known as hyperhydrosis, is a common condition. Hyperhydrosis can occur in any part of the body but primarily affects the forehead, axilla, palms and feet. Sanders and Shaari (U.S. Pat. No. 5,766,605) Walker (U.S. 20020086036) disclose a method of treating hyperhydrosis using needle and jet injections of BT.\nEccrine"} +{"output_text": " to convey information. The LSB is not used to convey information in the PCM codeword, but is used to convey information in the RBS codeword. The RBS codeword is a PCM codeword that is transmitted at a lower bit rate than the PCM codeword. 
The RBS codeword is used to convey information that is not contained in the PCM codeword. The RBS codeword is transmitted at a lower bit rate than the PCM codeword because the", "input_text": " corresponding linear 16-bit binary data. It can also be seen in FIG. 3 that the logarithmic function of the standard conversion format is approximated by a series of 8 linear segments.\nThe conversion from octet to analog voltage is well known, and as stated above, is based on a system called.mu.-law coding in North America and A-law coding in Europe. Theoretically, there are 256 points represented by the 256 possible octets, or.mu.-law codewords. The format of the.mu.-law codewords is shown in FIG. 4, where the most significant bit b.sub.7 indicates the sign, the three bits b.sub.6 -b.sub.4 represent the linear segment, and the four bits, b.sub.0 -b.sub.3 indicate the step along the particular linear segment. These points are symmetric about zero; i.e., there are 128 positive and 128 negative levels, including two encodings of zero. Since there are 254 non-zero points, the maximum number of bits that can be sent per signaling interval (symbol) is just under 8 bits. A.mu.-law or A-law codeword may be referred to herein as a PCM codeword. It is actually the PCM codeword that results in the DTN 20 codec to output a particular analog voltage. The codeword and the corresponding voltage may be referred to herein as \"points.\"\nOther factors, such as robbed-bit signaling, digital attenuation (pads), channel distortion and noise introduced by the subscriber loop, and the crowding of points at the smaller voltage amplitudes and the associated difficulty in distinguishing between them at the decoder/receiver, may reduce the maximum attainable bit rate. 
Robbed Bit Signaling (RBS) involves the periodic use of the least significant bit (LSB) of the PCM codeword"} +{"output_text": " of the data packets.\nIn the case where the number of data packets that are successfully received is larger than the number of ACK packets generated, the STA that is a sender of the data packets retransmits the data packets. In the case where the number of data packets that are successfully received is smaller than the number of ACK packets generated, the STA that is a sender of the data packets does not retransmit the data packets.\nIn the case where the number of", "input_text": "3.\nIn general, the propagation coefficients hxx are changed with time and are also changed by a change in a wireless channel such as fading, reduction in signal intensities, and the like. Moreover, when MIMO number is increased, an effect of the change in the wireless channel on the channel condition becomes large. That is, a packet error rate or a bit error rate becomes larger with the increase of the MIMO number. Therefore, the MIMO number is determined (limited) in accordance with the propagation coefficients and the like.\nWhen transmission of a data packet is unsuccessful, the receive side transmits a response packet indicating that failure or does not transmit any response packet. In this case, the transmitting side determines that transmission of the data packet is unsuccessful, and retransmits the data packet. However, retransmission of data packets simultaneously transmitted using MIMO is not specifically defined. Thus, a problem in the case where a conventional retransmission process is applied to such simultaneous transmission is now described.\nFIG. 15 shows a general processing on exchanging data packets. 
After a transmit-side STA transmits a data packet, a receive-side STA transmits an acknowledgement (hereinafter, ACK) packet for the received data packet, thereby giving notice of information about the ratio of successful receptions of data packets to total receptions in the past on the receive-side STA. That method for transmitting an ACK packet can be applied without change to a wireless packet communication method that uses MIMO. In this case, it is considered that a packet exchange sequence as shown in FIG. 16 is performed.\nAn STA receiving a plurality of data packets multiplexed by MIMO generates ACK packets. The number of those ACK packets is the same as the number of data packets that are successfully received. The thus generated ACK packets are sent back to an STA that is a sender"} +{"output_text": " with a lower cost opacification aid; and 4) to provide opacity to paper products that are not currently available in the market place. \nThe present invention provides a method for improving the opacity of paper and paperboard products by using a combination of a low cost opacification aid and a high cost opacification aid. The low cost opacification aid is a low cost, non-fouling, non-fiber-reactive, non-fiber-retent", "input_text": "The use of inorganic filler/pigments in papermaking usually requires these materials be made into an aqueous slurry dispersion, in which the filler/pigment slurry is applied and mixed with the aqueous fiber containing papermaking slurry prior to the papermachine. This also generally requires the inorganic filler/pigments to be made down into a workable slurry that can be stored, easily pumped and metered into the wet-end of the papermachine. 
Because the suspended, solid particles in these slurries have a tendency to settle, the filler/pigment slurries generally require constant agitation in their make-down tanks.\nAnother problem often associated with using inorganic filler/pigments in papermaking systems is their propensity to foul the papermachine wire and press felts. Fouling decreases the effectiveness of the papermachine to dewater the pulp slurry, thus requiring down time to clean and/or replace this papermachine equipment, and a resultant increase in the cost of producing the paper product.\nAs a result of the various problems identified above with using the inorganic filler/pigment based opacification aids, the papermaking industry is in need of new methods to improve and/or increase the opacity of various paper and paperboard grades whose optical properties are critical to their end-use functionality. Depending on the grade of paper to be made, the need for alternative opacification methods outside the use of mineral filler/pigments is being driven by the need to: 1) provide cost effective opacification to paper products without the aforementioned slurry handling, papermachine fouling and particle retention issues of inorganic fillers or pigments; 2) to provide equivalent opacity to current mineral or pigment filled sheets at a lower basis weight; 3) to improve the strength properties of high ash content grades of paper by replacing a portion of the mineral filler/pigment"} +{"output_text": " is connected to the embroidering machine, and the stitch data are read out from the external device and stored in a stitch data memory means of the embroidering machine. 
The embroidering machine is controlled by the stitch data stored in the stitch data memory means, and the embroidering operation is performed by the control data.\nIn the embroidering machine, the stitch data are read out from the external device and stored in the stitch data memory means, and the embroidering operation", "input_text": " data memory means for controlling the embroidering operation and, more particularly, to an embroidering machine capable of combining plural sets of stitch data separately recorded in an external recording device such as a paper tape into stitch data of a single continuous design and storing and reading this stitch data in and from the stitch data memory means.\nThe invention relates also to an embroidering machine capable of performing data editing such as change, deletion and insertion with respect to stitch data of an embroidery design stored in a memory at a desired stitch thereof and, more particularly, to an embroidering machine capable of temporarily stopping rotation of the embroidering machine main shaft at a stitch of the embroidery design in the course of the embroidering operation and executing data editing with respect to a desired stitch associated with the stitch at which the rotation has stopped thereby enabling data editing in a simple manner while confirming the actual embroidery design.\nThe invention relates also to an embroidering machine capable of performing data editing such as change or modification with respect to stitch data at a desired stitch in an embroidery design and, more particularly, to an embroidering machine capable of displaying a desired embroidery design on a screen of a display and changing or setting with respect to a desired stitch of the stitch data while confirming the embroidery design on the screen of the display.\nThe invention relates also to an embroidering machine having a communication interface and, more particularly, to an embroidering machine capable of transmitting and 
receiving stitch data for realizing the embroidering operation and respective control data for controlling the embroidering operation associated with this stitch data between the embroidering machine and an external device such as a computer having a communication function.\nIn an automatic embroidering machine, stitch data for various embroidery designs are stored in an external device such as a paper tape, the external device"} +{"output_text": " NodeB. The CSI-RS is transmitted over substantially the entire DL system BW, and can be used by UEs to perform measurements. The DMRS is transmitted over substantially the entire DL system BW, and can be used by UEs to transmit data or control signals.\nA NodeB transmits a PDSCH to a UE through a PDSCH Transmission Time Interval (TTI), which may be a sub-frame, a sub-frame including two slots, a sub-", "input_text": " such as, for example BSs or NodeBs. A UE, which is also commonly referred to as a terminal or a mobile station, may be fixed or mobile and may be embodied as a cellular phone, a personal computer device, etc. A NodeB is generally a fixed station and may also be referred to as an access point or some other equivalent terminology.\nDL signals consist of data signals carrying information content, control signals carrying DL Control Information (DCI), and Reference Signals (RSs), which are also known as pilot signals. A NodeB transmits data information or DCI to UEs through a Physical DL Shared CHannel (PDSCH) or a Physical DL Control CHannel (PDCCH), respectively.\nUL signals also consist of data signals, control signals and RSs. A UE transmits data information or UL Control Information (UCI) to a NodeB through a Physical Uplink Shared CHannel (PUSCH) or a Physical Uplink Control CHannel (PUCCH), respectively.\nA NodeB transmits one or more of multiple types of RSs, including a UE-Common RS (CRS), a Channel State Information RS (CSI-RS), and a DeModulation RS (DMRS). 
The CRS is transmitted over substantially the entire DL system BandWidth (BW), and can be used by all UEs to demodulate data or control signals or to perform measurements. A UE can determine a number of NodeB antenna ports from which a CRS is transmitted through a broadcast channel transmitted from the NodeB. To reduce the overhead associated with the CRS, a NodeB may transmit a CSI-RS with a density in the time and/or frequency domain that is smaller than that of the CRS, for UEs to perform measurements. A UE can determine the CSI-RS transmission parameters through higher layer signaling from the"} +{"output_text": " the valve head is attached to the other end of the plunger, whereby the valve head is moved in the direction of the axis of the conduit by the attractive force of the electromagnet.\nIn a further embodiment of the present invention, a cylindrical yoke for guiding a magnetic flux generated by the solenoid is disposed in the conduit at a position adjacent to the plunger, which yoke is movable in the direction of the axis of the conduit, whereby an initial position of a valve head", "input_text": " the problem of contamination.\nThe present invention provides a mass flow controller for controlling a mass flow rate in a predetermined range, in which a mass flow rate of a fluid is detected by a flow rate sensor and a control valve is operated so as to adjust the detected mass flow rate to a desired value. 
The control valve is arranged as a solenoid valve operated by means of a solenoid, and a plunger for opening and closing the solenoid valve is disposed within a cylindrical conduit having a hollow structure, whereby one-way flow of the fluid is effected in a space between an outer circumferential surface of the plunger and an inner circumferential surface of the conduit in a direction of the axis of the cylindrical conduit.\nIn one embodiment of the present invention, the outer circumferential surface of the plunger includes a groove extending in parallel to the axis of the conduit, to thereby provide a fluid flow path.\nIn another embodiment of the present invention, the plunger is made of a magnetic alloy having high anti-corrosion properties.\nIn a further embodiment of the present invention, the control valve comprises a spherical valve head attached to a forward end of the plunger and a valve seat corresponding to the valve head. The valve seat is arranged in a funnel-like form.\nIn a further embodiment of the present invention, a cylindrical yoke for guiding a magnetic flux generated by the solenoid is disposed in the conduit at a position adjacent to the plunger, which yoke is movable in the direction of the axis of the conduit, whereby an initial position of a valve head of the solenoid valve and an attractive force of an electromagnet can be adjusted by adjusting a gap between the plunger and the yoke.\nIn a further embodiment of the present invention, a spherical valve head is attached to one end of the plunger and a yoke having a funnel-like valve seat corresponding to"} +{"output_text": " in the network interface card (NIC) of the workstation.\nThe Internet is a worldwide network of computers. It is a network of networks that consists of millions of smaller networks, or Autonomous Systems (AS), connected to Internet 4. The Internet is a packet-switching network in which data is sent from a source to a destination in small units called packets. 
The Internet is a xe2x80x9cbest-effortxe2x80x9d network in", "input_text": " diagram of an IP network, which is an example of one type of flow-based network in which the technique of the present invention may be implemented. A flow can be a hard-state virtual circuit in an ATM network, a soft-state flow in an IP network (e.g., a MPLS tunnel), or a stateless connection as a TCP/IP connection in today's Internet. As shown in FIG. 1, the IP network 2 includes the Internet (or a WAN) 4 over which a Node 16 (e.g. a computer) can communicate with a separate node 6 via a plurality of intermediate nodes (e.g. R1, R3, R4). Node 6 may be, for example, a server which is part of Local Area Network (LAN) 7, connected to the Internet via routers R1 and R3. Router R3 (10) may, in turn, connect one or more other routers (e.g., router R2) with the Internet.\nA LAN is a communication network that serves users within a confined geographical area. It is made up of servers, workstations, a network operating system and a communications link. Servers are high-speed machines that hold programs and data shared by all network users. The workstations, or clients, are the user's personal computers, which perform stand-alone processing and access the network servers as required. The controlling software in a LAN is the network operating system, such as, for example, NetWare, UNIX, and/or Appletalk, which resides in the server. Message transfer is managed by a transport protocol such as, for example, IPX, SPX, SNA and/or TCP/IP. The physical transmission of data is performed by the access method (Ethernet, Token Ring, etc.) which is implemented"} +{"output_text": " the sample. The laser is focused on the surface of the sample and the temperature distribution is monitored. The temperature distribution is then used to calculate the thickness of the sample.\nIn the case of laser heating, the temperature distribution is monitored in the vicinity of the laser spot. 
The temperature distribution is then used to calculate the thickness of the sample. The temperature distribution is monitored in the vicinity of the laser spot. The temperature distribution is then used to calculate the thickness of the sample.\nIn the case", "input_text": " Plastics (FRP). The thermo-physical properties of such defects display a high contrast to the fibers and matrix of FRP. Such substantial contrasts allow the lateral conduction in the FRP to be disregarded and heat propagation within the sample be treated as a one-dimensional (1D) problem, making it possible to extract depth information from the thermography data. However, 1D model may only be valid if 3D diffusion can be ignored. For this to happen, one or several of the following criteria should be satisfied: The surface heating is uniform, so that there are no lateral gradients. The contrast in thermo-physical parameters between defect and sound regions of the sample is high enough to create temperature gradients much larger in comparison with deviation from one dimensional (1D) solution. The detection is performed shortly after the heat source is switched off, so that heat diffusion is minimal. Similarly this criterion can be defined if the location of the defect is close to the surface. \nAnother conventional approach is based on laser heating. In this approach, the heating may be performed in a non-uniform manner. Through such non-uniform heating, it is possible to detect defects that strongly affect lateral heat flow, like cracks. One of the recent examples is the flying laser spot thermography system. The interaction of laser with the surface is monitored continuously using an IR camera. When the laser spot is in the vicinity of a crack, the higher thermal resistivity of the crack leads to a reduced cooling and thus to a higher maximal temperature. Eventually, it gives rise to the thermal crack signature. 
By differentiation of the temperature profiles in different directions, the crack orientation can be reconstructed.\nFew thermography methods based on laser are also employed for material properties evaluation. For example, in Time Resolved Infrared Radiometry (TRIR), the heating with a laser is used to determine thickness of"} +{"output_text": ", pyrimidinyl, pyrazinyl, benzimidazolyl, quinolyl, isoquinolyl, cinnolinyl, phthalazinyl, quinazolinyl, quinoxalinyl, naphthyridinyl, pteridinyl, carbazolyl, xcex2-carbolinyl, acridinyl, phenazinyl, phenothiazinyl, and the like. The term xe2x80x9cheteroarylxe2x80", "input_text": " as cyclopropenyl, cyclobutenyl, cyclopentenyl, cyclohexenyl, cycloheptenyl, cyclooctenyl, and the like. The term \u201c(C3-C8)cycloalkenyl\u201d includes (C3-C6)cycloalkenyl.\nThe term \u201caryl\u201d represents phenyl or naphthyl.\nThe term \u201cbicyclic\u201d represents either an unsaturated or saturated stable 7- to 12-membered bridged or fused bicyclic carbon ring. The bicyclic ring may be attached at any carbon atom which affords a stable structure. The term includes, but is not limited to, naphthyl, dicyclohexyl, dicyclohexenyl, and the like.\nThe term, \u201cmono or bicyclic heteroaryl radical\u201d, refers to radicals derived from monocyclic or polycyclic, aromatic nuclei having 5 to 14 ring atoms and containing from 1 to 3 hetero atoms selected from the group consisting of nitrogen, oxygen or sulfur. 
Typical heterocyclic radicals are pyrrolyl, furanyl, thiophenyl, pyrazolyl, imidazolyl, indolizinyl, isoquinolyl, benzothienyl, isoindolizinyl, oxazolyl, indolyl, carbazolyl, norharmanyl, azaindolyl, dibenzofuranyl, thianaphthenyl, dibenzothiophenyl, indazolyl, imidazo(1.2-A)pyridinyl, anthranilyl, purinyl, pyridinyl, phenylpyridinyl"} +{"output_text": "200,460; 5,200,461; 5,200,462; 5,200,463; 5,200,464; 5,200,465; 5,200,466; 5,200,467; 5,200,468; 5,200,469; 5,200,470; 5,200,471; 5,200,472; 5,200,473; 5,200,474; 5,200,475; 5,200,476;", "input_text": " latex is available from B.F. Goodrich under the trade designation HYCAR.\nRepresentative halogenated polyolefins include chlorinated natural rubber, chlorine- and bromine-containing synthetic rubbers including polychloroprene, chlorinated polychloroprene, chlorinated polybutadiene, hexachloropentadiene, butadiene/halogenated cyclic conjugated diene adducts, chlorinated butadiene styrene copolymers, chlorinated ethylene propylene copolymers and ethylene/propylene/non-conjugated diene terpolymers, chlorinated polyethylene, chlorosulfonated polyethylene, poly(2,3-dichloro-1,3-butadiene), brominated poly(2,3-dichloro-1,3-butadiene), copolymers of \u03b1-haloacrylonitriles and 2,3-dichloro-1,3-butadiene, chlorinated poly(vinyl chloride) and the like including mixtures of such halogen-containing elastomers.\nLatices of the halogenated polyolefin can be prepared according to methods known in the art such as by dissolving the halogenated polyolefin in a solvent and adding a surfactant to the resulting solution. Water can then be added to the solution under high shear to emulsify the polymer. The solvent is then stripped to obtain a latex. The latex can also be prepared by emulsion polymerization of the halogenated ethylenically unsaturated monomers.\nButadiene latices are particularly preferred as the flexibilizer (B). Methods for making butadiene latices are well-known and are described, for example, in U.S. Pat. Nos. 
4,054,547 and 3,920,600, both incorporated herein by reference. In addition, U.S. Pat. Nos. 5,200,459; 5,"} +{"output_text": "a and TEB 24b are connected to the downstream communication medium 22. The modules TEA 24a and TEB 24b are connected to the network termination unit 21 via a two-wire metallic line. The modules TEA 24a and TEB 24b are also connected to the upstream communication medium 23 via a two-wire metallic line.\nThe modules TEA 24a and TEB 24b are also connected to the upstream communication medium 23 via a two-wire metallic line. The", "input_text": " the transmit data encoder 13b to the transmit channel access unit 13a, then placed on the communication medium 11. The encoded transmit data are also furnished to the collision detect unit 13e.\nThe collision detect unit 13e compares the encoded transmit data with the receive data received from the communication medium via the receive channel access unit. If the two data do not match, presumably due to a collision with data transmitted on the communication medium by another functional module, the collision detect unit 13e notifies the data link layer of the unmatch via a lead 16. Upon receiving such notification, the data link layer stops sending data and prepares to resume transmission later.\nAnother well-known contention control system is found in the physical layer (layer 1) of the basic interface (I interface) described in recommendation I.430 in the Integrated Services Digital Network (ISDN) I series of recommendations of the CCITT. This recommendation stipulates the use of special D and E channels for detecting collisions when two or more functional modules attempt to access the communication medium simultaneously.\nFIG. 3 is a block diagram of layer 1 of an ISDN network with this CCITT interface. 
The network includes a network termination (NT) unit 21 having an echo generator 21a connected to a downstream communication medium 22 and an upstream communication medium 23, both of which comprise two-wire metallic lines. In the drawing, a transmit data frame 26 is shown on the upstream communication medium 23. The frame comprises several channels, one of which is a one-bit D channel 26a. The network termination unit 21 receives the transmit data frame 26 and the echo generator 21a adds an echo signal E which it creates by copying the D-channel data 26a. The resulting frame is placed on the downstream communication medium 22 as the receive data frame 27.\nFunctional modules (TE--terminal equipment) such as the modules TEA 24"} +{"output_text": " group. The controller of the ACD may then transfer the call to the other agent or agent group.\nIn a further aspect of the invention, a method and apparatus are provided for transferring a call from a first agent to a second agent. The method includes the steps of: (a) receiving a call from a first agent; (b) determining whether the call is to be transferred to a second agent; (c) if the call is to be transferred, transferring the call to the second", "input_text": " organization may receive calls directed to different call targets over the same trunk lines. In such a case, the call target may be identified to the ACD by a pulse code modulated (PCM) signal transferred from the PSTN to the controller of the ACD by a dialed number identification service (DNIS) operating from within the PSTN.\nIn systems associated with service organizations, where many calls are received and handled by many agents, it may be important for an agent to have ready access to customer files. In such a situation, a database is maintained of existing customers. Customer records may be displayed on agent terminals as the agents converse with specific customers. 
In some cases, the customer may be identified to the database for display of records on the terminal by the agent entering a customer identifier into a keyboard associated with the terminal. Alternatively, the controller of the ACD may transfer an identifier of the customer to the database based upon an automatic number identification (ANI) facility, operating from within the PSTN.\nWhere ANI is used, the controller of the ACD receives the ANI digits (identifying the caller via the caller's telephone number) at the same time the call arrives from the PSTN. Upon selecting an agent, the controller may transfer the call to a queue for the selected agent or directly to the selected agent. At the same time that the call is delivered to the agent, the controller sends an identifier of the selected agent and ANI number of the customer to a controller of the database (the host). The host, in turn, displays the customer records via a computer monitor of the selected agent at the same time the call is delivered.\nAs a further feature, calls may be transferred among agents. Where a first agent finds that he or she cannot help a particular customer, the agent may activate a key on a keyboard of the agent and enter an identity of another agent or agent"} +{"output_text": " 3A to 3(C) show default indexes for list 0 prediction, default indexes for list 1 prediction and list 1 reference pictures for direct mode of respective B pictures in an IBBB pattern using only the B pictures, respectively. In FIG. 3A, when a B picture to be coded is B8, a temporally preceding B5 with a list 1 index 0 is a list 1 reference picture for direct mode. As shown in FIG. 3(B), a list 1 reference picture", "input_text": " reference picture for direct mode is a list 0 reference picture pointed by a motion vector of a co-located block in the list 1 reference picture for direct mode.\nFIGS. 
1(A) to 1(C) show default indexes for list 0 prediction, default indexes for list 1 prediction and list 1 reference pictures for direct mode of respective B pictures in an IBBBP pattern when the number of available list 0 and list 1 reference pictures (or the size of a short-term buffer) is 6, respectively. Here, the default indexes for list 0 prediction and the default indexes for list 1 prediction are dependent on an output order, or POC value, of a previously decoded reference picture regardless of a decoding order. In FIG. 1, all the B pictures use a temporally following P picture as the list 1 reference picture for direct mode.\nFIGS. 2(A) to 2(C) show default indexes for list 0 prediction, default indexes for list 1 prediction and list 1 reference pictures for direct mode of respective B pictures in an IBBB pattern using only the B pictures, respectively. In FIG. 2(A), when a B picture to be coded is B8, a temporally preceding B5 with a list 1 index 0 is a list 1 reference picture for direct mode. As shown in FIG. 2(B), a list 1 reference picture for direct mode of B7 to be subsequently decoded is the temporally following B8. Last, as shown in FIG. 2(C), a list 1 reference picture for direct mode of B9 to be subsequently decoded is the temporally preceding B7.\nIn conclusion, as seen from FIGS. 1(A) to 2(C), a list 1 reference picture for direct mode may be a P or B picture temporally following a B picture to be coded, or a B picture temporally preceding it.\nFIGS."} +{"output_text": "\u2014.\nThe term \u201cheteroaryl\u201d designates an aryl group in which one or more carbons have each been replaced by a heteroatom independently selected from the group consisting of oxygen, sulfur and nitrogen (NH). Heteroaryl groups can preferably contain 1, 2, or 3 heteroatom(s) and more preferably one heteroatom, independently selected from the group consisting of oxygen, sulfur and nitrogen (NH), as ring member(s). 
Heteroaryl groups can preferably", "input_text": "\u2014(C1-5 alkyl), \u2014O\u2014CF3, \u2014S\u2014CF3, phenyl, and \u2014O-benzyl, and the aforementioned heteroaryl radicals can each optionally exhibit 1, 2, 3, 4, or 5 heteroatom(s) independently selected from the group consisting of oxygen, nitrogen, and sulfur as ring member(s).\nThe term \u201cheteroalkylene\u201d designates an alkylene chain in which one or more carbons have each been replaced by a heteroatom independently selected from the group consisting of oxygen, sulfur and nitrogen (NH). Heteroalkylene groups can preferably contain 1, 2, or 3 heteroatom(s) and more preferably one heteroatom, independently selected from the group consisting of oxygen, sulfur and nitrogen (NH), as link(s). Heteroalkylene groups can preferably be two to six-membered and more preferably two or three-membered.\nExamples of heteroalkylene groups include \u2014CH2\u2014CH2\u2014O\u2014CH2\u2014, \u2014CH2\u2014CH(CH3)\u2014O\u2014CH2\u2014, \u2014(CH2)\u2014O\u2014, \u2014(CH2)2\u2014O\u2014, \u2014(CH2)3\u2014O\u2014, \u2014(CH2)4\u2014O\u2014, \u2014O\u2014(CH2)\u2014, \u2014O\u2014(CH2)2\u2014, \u2014O\u2014(CH2)3\u2014, \u2014O\u2014(CH2)4\u2014, \u2014C(C2H5)\u2014(H)\u2014O\u2014, \u2014O\u2014C(C2H5)\u2014(H)\u2014, \u2014CH2\u2014O\u2014CH2\u2014, \u2014CH2\u2014S\u2014CH2\u2014, \u2014CH2\u2014NH\u2014CH2\u2014, \u2014CH2\u2014NH\u2014, and \u2014CH2\u2014CH2\u2014NH\u2014CH2\u2014CH2"} +{"output_text": "HTML) format, the file is associated with the \u201cEXCEL\u201d spreadsheet program. However, if the same electronic file is created using the \u201cEXCEL\u201d spreadsheet program and is saved in a \u201c.DOC\u201d format, the file is associated with the \u201cEXCEL\u201d spreadsheet program.\nIn this example, the file is associated with two different application programs. However, the user cannot associate the file with both application programs at the same time. 
The user must select one of the", "input_text": " saved in an \u201cEXCEL\u201d specific format. In this manner, the file is associated with the corresponding electronic spreadsheet program.\nThe use of file extensions has several benefits. First, the file extension allows the user to quickly identify the electronic files that are associated with a particular application when the user views a list of files contained within a directory or a folder. Second, and more importantly, the extension associates the electronic file with the particular application program which was used to create the file. The logical association, which is typically stored in a look-up table within the computer system or disk, allows simple, easy file management by the user.\nFor example, the user can open an electronic file by selecting the electronic file with a pointing device, such as a mouse. The computer operating system retrieves the extension, locates the extension in the look-up table stored in the computer system or disk, and retrieves and launches the associated application program using the electronic file as input.\nAnother method of associating an electronic file with an application program is to write an identifier tag within the electronic file to indicate which application program is associated with the electronic file. The identifier tag associates the format of the electronic file with a particular application and is stored within the electronic file itself. The logical association of the external identifier tag to the particular application is stored in a look-up table in the computer system or hard disk drive.\nHowever, each prior method of file identification has the drawback that only one application program can be associated with an electronic file. Usually, this does not pose a problem to users because most electronic file operations can be performed by one application program. 
However, there are instances when an electronic file needs to be associated with two or more different application programs.\nFor example, if an electronic file is created using the \u201cEXCEL\u201d spreadsheet program and is saved in a Hypertext Markup Language ("} +{"output_text": " use in the network. The \u201cscanning\u201d is carried out by the servers in the data network, and the end points are configured automatically by the servers.\nThe known scanning method is disadvantageous in that it is not possible to configure end points which have been added to the network manually. Furthermore, the known scanning method is not able to configure end points which have been added to the network manually, and which are not able to receive configuration data from the servers.\nIt is an object of", "input_text": ") which can be used in this network segment.\nIn known data networks, it has been found to be disadvantageous that devices must be configured manually at regular intervals, which is associated with a large amount of labor effort. This is particularly true when an existing configuration has to be changed because, for example, a central device in the corresponding data network has changed its network address, or when central devices are added to the network, or are removed from it. Whenever the association between end points and central devices in the data network is changed, this results in the necessity to change the configuration of the end points in real time.\nAlthough the use of the DHCP method as described above allows the configuration of devices with an IP address, an IP subnetwork mask and with the address of a DNS server, it is, however, possible only to a restricted extent with the known DHCP servers to transmit to the devices (computers) an amount of configuration data which is significantly greater than this \u201cbasic configuration\u201d. 
As mentioned above, the clients in speech data networks, in particular, must be supplied not only with the \u201cbasic configuration\u201d but also with a large number of other information items (parameters). Furthermore, although the DHCP method is able to assign a newly connected device a free IP address from the range of available free IP addresses, this is not intended to allow, for example, selection of a suitable gateway for a device from a number of gateways in a data network, and to assign this for use.\nIt is known for the association between end points and central devices (servers) to be updated automatically by the servers in a data network carrying out so-called \u201cscanning\u201d at regular time intervals. The aim of the \u201cscanning\u201d is to find end points which have been added to the network and to send all of the necessary information to these end points in order to configure them for"} +{"output_text": " of 6 to 20 carbon atoms, or aralkyl groups of 7 to 12 carbon atoms.\nOf the groups represented by R107, R108 and R109, exemplary alkyl groups include methyl, ethyl, propyl, isopropyl, n-butyl, sec-butyl, tert-butyl, pentyl, hexyl, heptyl, octyl, amyl, cyclopentyl, cyclohexyl, cycloheptyl, norbornyl, and adamantyl. Exemplary", "input_text": " of Formula (P2) \nHerein, R105 and R106 independently represent straight, branched or cyclic alkyl or halogenated alkyl groups of 1 to 12 carbon atoms, aryl or halogenated aryl groups of 6 to 20 carbon atoms, or aralkyl groups of 7 to 12 carbon atoms.\nOf the groups represented by R105 and R106, exemplary alkyl groups include methyl, ethyl, propyl, isopropyl, n-butyl, sec-butyl, tert-butyl, pentyl, hexyl, heptyl, octyl, amyl, cyclopentyl, cyclohexyl, cycloheptyl, norbornyl, and adamantyl. Exemplary halogenated alkyl groups include trifluoromethyl, 1,1,1-trifluoroethyl, 1,1,1-trichloroethyl, and nonafluorobutyl. 
Exemplary aryl groups include phenyl; alkoxyphenyl groups such as p-methoxyphenyl, m-methoxyphenyl, o-methoxyphenyl, ethoxyphenyl, p-tert-butoxyphenyl, and m-tert-butoxyphenyl; and alkylphenyl groups such as 2-methyl-phenyl, 3-methylphenyl, 4-methylphenyl, ethylphenyl, 4-tert-butylphenyl, 4-butylphenyl, and dimethylphenyl. Exemplary halogenated aryl groups include fluorophenyl, chlorophenyl, and 1,2,3,4,5-pentafluorophenyl. Exemplary aralkyl groups include benzyl and phenethyl.\n(iii) Glyoxime Derivatives of Formula (P3) \nHerein, R107, R108 and R109 independently represent straight, branched or cyclic alkyl or halogenated alkyl groups of 1 to 12 carbon atoms, aryl or halogenated aryl groups"} +{"output_text": " for new polymers which are useful in hair styling products. In particular, there is a need for new polymers which are useful in hair styling products which are capable of providing a high degree of hair styling hold, and which are also capable of providing a high degree of hair styling hold when the hair styling product is applied to wet hair.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a semiconductor device having a multilayer wiring structure.\nIn", "input_text": " or water solvent, and then set to form rigid welds between hair fibers when the solvent evaporates as the hair dries. These hair fiber welds form the basis for the style hold ability of conventional hair styling products. When these welds are broken, they remain broken unless the appropriate polymer solvent is added to redissolve the adhesive and reform the welds when the hair dries.\nIn addition, many polymers said to be useful in hair styling products are multi-component polymers which combine three, four, and even more monomers into the polymer chains. Frequently, one of the monomer components is vinyl pyrrolidone. Examples of such complex polymer systems are found in U.S. Pat. Nos. 
3,222,329 to Grosser, et al., issued Dec. 7, 1965; 3,577,517 to Kubot, et al., issued May 4, 1971; 4,012,501 to Farber, issued Mar. 15, 1977; and 4,272,511 to Papantoniou and Mondet, issued June 9, 1981; the disclosures of all these patents being incorporated herein by reference in their entirety.\nOther polymers said to be useful for hair styling compositions have been disclosed, such as block polymers. These block polymers have two or more glass transition temperatures. Examples of such block polymer systems are found in U.S. Pat. Nos. 3,907,984 to Calvert, et al., issued Sept. 23, 1975; 4,030,512 to Papantoniou, et al., issued June 21, 1977; and 4,283,384 to Jacquet, et al., issued Aug. 11, 1981; the disclosures of all these patents being incorporated herein by reference in their entirety.\nNotwithstanding the great effort already put forth to identify these adhesive polymers for use in temporary set hair styling products, there remains a continuing need"} +{"output_text": ", a resin composition for the cage is injected into a mold cavity. The mold cavity is formed by a pair of mold halves. The mold halves are provided with a plurality of sprue holes for injecting the resin composition for the cage. The sprue holes are arranged in a circumferential direction of the mold cavity. The sprue holes are arranged at a predetermined pitch in the circumferential direction of the mold cavity. The sprue holes are arranged at a predetermined pitch in the circumferential direction of the mold cavity.", "input_text": "). In this case several pairs of semiconductor switches connected in series may be connected in parallel. In addition, the semiconductor switches may be constituted by a large number of individual semiconductor-switch modules with, in each case, low switching capacities. 
Through the use of many semiconductor-switch elements with, in each case, a relatively low switching capacity which are, however, easy to connect in parallel, effective cooling can be achieved, since the many individual components can be reached well by the cooling medium. In rotating a rolling bearing incorporating a resin-made cage at a high speed, a centrifugal force generated owing to a high-speed rotation acts on a cage. As a result, the cage deforms. Owing to the deformation of the cage, the friction between the cage and balls held by the cage becomes high, and the torque of the rolling bearing becomes high. An increase of the friction therebetween causes the bearing to generate heat. Further owing to the deformation of the cage, the cage may contact an outer ring of the bearing. Resin melts owing to frictional heat caused by the contact between the cage and the outer ring. Thereby there may be a case where the rolling bearing is prevented from rotating. Thus the resin-made cage to be incorporated in the rolling bearing which is used at a high-speed rotation is an important bearing member.\nTo restrain the cage from deforming when the rolling bearing rotates at a high speed, it is necessary to increase the mechanical strength and elastic modulus of a resin composition for the cage. To this end, normally this problem is dealt with by increasing the mixing amount of a fibrous reinforcing material such as glass fiber in the resin composition.\nA \u201ccrown-type cage\u201d is used as the resin-made cage of a deep groove ball bearing which is a kind of the rolling bearing. At a production time of the crown-type cage by injection molding"} +{"output_text": " standard does not provide a mechanism for a central processing system to validate and format an electronic medical claim that is transmitted to a trading partner in a format that is different from the format used by the central processing system. 
What is needed is a system that does not require a central processing system to transmit electronic medical claims to a trading partner in a format that is the same as the format used by the central processing system.\nAnother limitation of systems previously known is that the central processing system must be able to communicate", "input_text": " and format the flow of data from a standardized physician's terminal to the central processing system of this patent is provided in the claims, insured, physician, insurance company, zip code, bad credit card, and insurance check files associated with a variety of databases at the central processing system. The information in these files must be provided by a plurality of insurance carriers and employers that receive electronic claims from the central processing computer. As a consequence, the maintenance and updating of these databases with information from the insurance carriers and employers must be performed at the central processing system. What is needed is a system that does not require a centralized database for validating and formatting an electronic medical claim that must be maintained with insurance carrier data.\nAnother limitation of systems previously known include the requirement that the central processing system transmit medical claims to insurance carriers and receive remittance data from insurance carriers in the same communication format and protocol used by the computer stations at the insurance carriers. In an effort to standardize both forms of communication, ANSI (American National Standards Institute) has generated an ANSI 837 standard for medical claims and an ANSI 835 for remittance data that specifies the format for a variety of message types that contain the various types of information to be exchanged among the central processing system and the computer stations within a typical computer system used in the healthcare industry. 
One limitation of the ANSI standards, however, is that a number of data fields in the data messages specified by the standard are optional and may or may not be used by one or more of the insurance carriers that are members of a medical claim processing system. Typically, insurance carriers, sometimes called trading partners, contract with a business partner who runs a central processing system to provide the carrier with the electronic medical claims from the healthcare providers. Although the optional data fields provided in the message formats specified by the ANSI standards support different variations within the standard, the ANSI 837"} +{"output_text": " with the semiconductor laser 10 in accordance with data EFM2; numeral 13 denotes a switch means for disconnecting the current source 7 with the semiconductor laser 10 in accordance with data EFM3; numeral 14 denotes a switch means for disconnecting the current source 8 with the semiconductor laser 10 in accordance with data EFM4; numeral 15 denotes a switch means for disconnecting the current source 9 with the semiconductor laser 10 in accordance with data EFM5; numeral 16 denotes a switch", "input_text": " a bottom hold circuit for holding a bottom level of the output from the monitor circuit 2; numeral 21 denotes a sample-hold circuit for sample-holding the output from the monitor circuit 2; numeral 22 denotes a peak hold circuit for holding a peak level of the output from the monitor circuit 2; numeral 19 denotes a control circuit for outputting a first, a second and a third digital signals corresponding to a bias reference voltage, an erase power reference voltage and a peak power reference voltage, respectively; numeral 26 denotes a D/A converter for converting the first digital signal outputted by the control circuit 19 to a bias reference voltage; numeral 27 denotes a D/A converter for converting the second digital signal outputted by the control circuit 19 to an erase reference voltage; numeral 28 
denotes a D/A converter for converting the third digital signal outputted by the control circuit 19 to a peak power reference voltage; numeral 23 denotes a servo amplifier for comparing the bias reference voltage outputted by the D/A converter 26 with the bottom level which is held in the bottom hold circuit 20 to amplify the error; numeral 24 denotes a servo amplifier for comparing the erase reference voltage outputted by the D/A converter 27 with a sample-hold level which is held in the sample-hold circuit 21 to amplify the error; numeral 25 denotes a servo amplifier for comparing the peak power reference voltage outputted by the D/A converter 28 with the peak hold level which is held in the peak hold circuit 22 to amplify the error; numerals 7,8 and 9 denote current sources for generating the currents corresponding to the outputs of the servo amplifiers 23,24 and 25, respectively; numeral 11 denotes a switch means for disconnecting the current source 8 with the semiconductor laser 10 in accordance with data EFM1; numeral 12 denotes a switch means for disconnecting the current source 9"} +{"output_text": " antibodies involves the use of a maleimide-containing reagent. (See, e.g., U.S. Pat. No. 5,225,539 issued on Jul. 6, 1993 to Bauer et al.). This method has been reported to produce antibodies which are capable of binding to a target molecule with high affinity and specificity.\nHowever, the above-described methods of site-specific attachment of molecules to antibodies are not always suitable for the preparation of antibodies which are capable of binding to", "input_text": " or between protein molecules. 
Thus, there is a danger that if a naturally occurring cysteine residue is used as a site of attachment, it will interfere with the normal folding and stabilization of the antibody protein.\nIn an effort to obviate such problems, alternative strategies have been developed which provide for site-selective attachment of a desired molecules to antibodies, without loss of antigen-binding activity. For example, it is known to produce recombinant antibodies comprising cysteine residues introduced into their surface structure to provide a thiol group which is available for covalent binding to an effector or reporter molecule. This method has been reported to facilitate the site-specific attachment of desired molecules without loss of antigen binding properties. (See, U.S. Pat. No. 5,219,996 issued on Jun. 15, 1993 to Bodmer et al.) However, this is not always possible or convenient since it obviously requires the possession of a recombinant DNA encoding the particular antibody.\nIt has further been proposed to derivatize immunoglobulins by selectively introducing sulfhydryl groups in the Fc region of an immunoglobulin, using reaction conditions which purportedly do not result in alteration of the antibody combining site. Antibody conjugates produced according to this methodology are disclosed to exhibit improved longevity, specificity and sensitivity (U.S. Pat. No. 5,196,066 issued on Mar. 2, 1993 to Bieniarz et al.).\nSite-specific attachment of effector or reporter molecules, wherein the reporter or effector molecule is conjugated to a carbohydrate residue in the Fc region has also been disclosed in the literature. (See, e.g., O'Shannessy et al., J. Immun. Meth., 99, 153-161 (1987)). 
This approach has been reported to produce diagnostically and therapeutically promising antibodies which are currently in clinical evaluation.\nAnother known method of site-specific attachment of molecules to"} +{"output_text": " resistors 36 and 37.\nThe differential amplifier 20 is a common-emitter amplifier. The emitter resistors (24 and 25) are in series with the emitters of the two transistors 22 and 23. The emitter resistors are also in series with the emitters of the two transistors 22 and 23. The emitter resistors are also in series with the emitters of the two transistors 22 and 23. The emitter resistors are also in series with the emitters of the two transistors 22 and", "input_text": "+ by collector load resistors 27 and 28 which each have a resistance value of R.sub.C. The base leads of the two transistors form a differential input port 32 and leads from their collectors form a differential output port 34. The differential voltage gain from input port 30 to output port 32 is given by an approximate gain expression of R.sub.C / R.sub.E.\nThis approximate gain expression ignores the effects of various transistor parameters which include the small-signal emitter resistance r.sub.e (resistance of the forward biased base-emitter junction), current gain.beta. (the ratio of collector current change to a change of base current) and the Early voltage V.sub.A (an imaginary voltage useful in defining an increase in collector current which occurs as the collector-emitter voltage is increased).\nThe emitter resistance r.sub.e modifies the differential amplifier's approximate gain expression because, in each of the transistors 22, 23, this resistance is in series with that transistors' external source resistor (24 or 25). Changes in current gain.beta. (e.g., because of temperature movement) cause current variations across load resistors which results in altered amplifier gain. 
The Early voltage is associated with collector current changes that result when the base width is modulated by variations in the collector-emitter voltage. These collector current variations cause undesirable changes in the amplifier gain.\nAll transistor parameters are sensitive to changes in operating conditions, e.g., temperature, bias current and supply voltage. Their values also vary across transistor production lots. Therefore, the differential amplifier 20 finds its most effective use in circuits that can accommodate significant changes in the approximate gain expression of R.sub.C / R.sub.E. In applications that demand a more stable and accurate gain, the differential gain can be accurately set by the use of negative feedback, e.g., by the addition of feedback"} +{"output_text": " through the cylinder. The rods are connected to the pistons by wristpins that extend through the cylinder. The wristpins are connected to the crankshafts by a long gear train. The wristpins are also connected to the crankshafts by a second gear train. The second gear train is a short gear train that is located between the wristpins and the crankshafts. The second gear train is used to couple the crankshafts to the output drive. The second gear", "input_text": " transit of the piston rings past the ports I and E. The engine has two crankshafts C1 and C2, one disposed at each end of the cylinder. The crankshafts, which rotate in the same direction, are linked by rods R1 and R2 to respective pistons. Wristpins W1 and W2 link the rods to the pistons. The crankshafts are geared together to control phasing of the ports and to provide engine output. Typically, a turbo-supercharger is driven from the exhaust port, and its associated compressor is used to scavenge the cylinders and leave a fresh charge of air each revolution of the engine. 
The advantages of Junkers' opposed piston engine over traditional two-cycle and four-cycle engines include superior scavenging, reduced parts count and increased reliability, high thermal efficiency, and high power density. In 1936, the Junkers Jumo airplane engines, the most successful diesel engines to that date, were able to achieve a power density and fuel efficiency that have not been matched by any diesel engine since. According to C. F. Taylor (The Internal-Combustion Engine in Theory and Practice: Volume II, revised edition; MIT Press, Cambridge, Mass., 1985): \u201cThe now obsolete Junkers aircraft Diesel engine still holds the record for specific output of Diesel engines in actual service (Volume I, FIG. 13-11).\u201d\nNevertheless, Junkers' basic design contains a number of deficiencies. The engine is tall, with its height spanning the lengths of four pistons and at least the diameters of two crankshafts, one at each end of the cylinders. A long gear train with typically five gears is required to couple the outputs of the two crankshafts to an output drive. Each piston is connected to a crankshaft by a rod that extends"} +{"output_text": " like, it is not uncommon for a pacemaker to malfunction in a manner that is not readily apparent to the user.\nFor example, a pacemaker may be implanted in a patient and may operate without incident for a period of time. However, the battery power source may become depleted, or the circuitry may malfunction, such that the pacemaker is unable to sense or stimulate the heart at a prescribed rate. In such a case, the patient may not be aware of the malfunction, and may not", "input_text": " and/or epicardial leads to sense amplifiers housed in an implanted pacemaker. Electrical activity occurring in these chambers can thus be sensed. When electrical activity is sensed, the pacemaker assumes that a depolarization or contraction of the indicated chamber has occurred. 
If no electrical activity is sensed within a prescribed time interval, typically referred to as an atrial or ventricular escape interval, then a pulse generator, also housed within the pacemaker housing, generates a stimulation pulse that is delivered to the indicated chamber, usually via the same lead or electrode as is used for sensing. This stimulation pulse causes or forces the desired depolarization and contraction of the indicated chamber to occur. Thus, with a demand pacer, the heart will either beat on its own (without stimulation from the pacemaker) at a rate that is at least just slightly faster than the stimulation rate defined by the escape interval, or the heart will be stimulated by the pacer at a rate controlled by the escape interval. The stimulation rate provided by the pacemaker is typically referred to as the \"programmed rate.\"\nAs noted, most pacemakers include a sensor circuit that looks for electrical signals from spontaneous heart activity. On detection of such activity, the pacemaker stimulation action is modified, depending upon the functional mode or type of pacemaker. For example, in the VVI mode (ventricle paced and sensed, response inhibited mode), sensing of heart activity under certain time restrictions is interpreted as normal heart activity such that the stimulating action is inhibited.\nThe discussion thus far has followed the assumption that a pacemaker and its associated circuitry operate without malfunction. By the very nature of manmade devices, such is not always the case. Whereas electronic circuitry can be, and is, incorporated within the pacemaker itself for exercising or testing various circuit components, the status of battery power sources, and the effectiveness of various amplifiers, waveform shaping stages and the"} +{"output_text": " has been drilled, the wire rod is removed from the well and is then subjected to a mechanical cleaning operation. 
The wire rod is generally subjected to a cleaning operation in which the wire rod is subjected to a mechanical action which removes the scale and other impurities from the surface of the wire rod. The wire rod is generally subjected to a cleaning operation in which the wire rod is subjected to a mechanical action which removes the scale and other impurities from the surface of the wire rod. The wire rod is generally subjected", "input_text": " compromises with negative effects in terms of operation of the device: for example, a necessary reduction in the speed of rotation or smaller dimensions of the device in the axial direction, i.e. parallel to the direction of feeding of the wire rod.\nThe object of the present invention is therefore to eliminate the drawbacks mentioned above. In accordance with the invention, this object is achieved by means of a device for mechanically cleaning wire rod, of the type indicated in the preamble of claim 1, in which the jaws which grip the steel wool, or other abrasive material, against the wire rod are supported by operating members consisting of adjacent and independent pairs of floating plates.\nThe main advantage obtained by means of the present invention consists in the fact that a uniform distribution, in the axial direction, of the pressure exerted by the jaws on the abrasive material is achieved and therefore the efficiency of the device is optimized. 
Moreover, the mechanical balancing, which is independent for each pair of plates, is advantageously achieved without obstructing the zone where the steel wool rubs against the wire rod; the structure of the plates may be lightened since their mutual independence replaces the considerable inertia of the monolithic device; as a result of the modularity of the jaws formed in this way, it is also possible to provide devices with dimensions satisfying the most widely varying requirements, keeping to a minimum the warehouse supplies, formed by plates which are all identical to each other. Finally, an important advantage consists in the fact that, for the same dimensions compared to the previously known device, the jaws may be formed by two sets of pairs of rotating plates which rotate in opposite directions to each other: in this way, twisting of the wire rod is compensated for and the surface of the latter has, etched on it, a double helix which will ensure a better propensity for retaining the lubricant and therefore an improved cleaning quality. Once a petroleum well"} +{"output_text": "ered the use of higher frequency operation.\nA method for fabricating a miniature torsional scanning mirror has been described by K. Petersen in U.S. Pat. No. 4,926,811. The mirror is fabricated by bonding a silicon wafer to a glass substrate. The mirror is suspended by a central torsion member having a mirror surface. The torsion member is suspended by a pair of torsion arms having a mirror surface. The torsion arms are suspended by a pair of torsion wires", "input_text": "050. In general, the larger the mass of the element used to perform fine track servoing, the lower the servo bandwidth becomes and the lower the track density that can be read or written.\nA method for moving a folding prism or mirror with a galvanometer actuator for fine tracking has been disclosed by C. Wang in U.S. Pat. No. 5,243,241. 
The galvanometer consists of bulky wire coils and a rotatable magnet mounted on a linear actuator arm attached to a flying magneto-optical head, but not mounted on the slider body itself. This design limits the tracking servo bandwidth and achievable track density due to its size and weight. Its complexity also increases the cost and difficulty of manufacture.\nMiniature torsional scanning mirrors have been described, viz, \"Silicon Torsional Scanning Mirror\" by K. Petersen, IBM J. Res. Develop., Vol. 24, No. 5 Sept 1980, pp. 631-637. These mirrors are generally prepared using procedures developed in the semiconductor processing arts. Petersen describes a torsion mirror structure having a 134.mu.m thick silicon wafer defining a distal frame suspending a central silicon mirror element suspended by lateral torsion members therebetween. The lateral mirror dimensions are about 2.1 by 2.2 mm. The mirror is bonded over a 7 to 10.mu.m deep etched well in a glass slide substrate, having evaporated electrodes deposited therein. The mirror is rotationally deflected by voltages applied between the mirror and the electrodes by connecting wires. Scanning angles of up to 2 degrees at a resonant operating frequency of up to 15 kHz were reported. The size and mass of the mirror limited higher operating frequency. Also, mirror distortion caused by the high dynamic torque (i.e., peak angular acceleration ) at higher frequency was a limiting factor. The high mechanical Q of prior art mirrors hind"} +{"output_text": " the display were perfectly bright, the display would be invisible to the driver.\nThe prior art has attempted to solve the problem of driver visibility by providing a display which is brighter than the ambient light. However, this approach has not been entirely satisfactory. The brightness of the display is limited by the power available to drive the display. The brightness of the display is also limited by the ability of the display to dissipate heat. 
The brightness of the display is further limited by the ability of the display to", "input_text": " but move throughout an elliptical viewing area known as the eye motion box or the eyellipse. Drivers also have different seated body lengths and prefer different seat height and position adjustments. An eyellipse of about 8\"H.times.5\"V.times.10\"D centered at about 0.5\" from the instrument panel will accommodate most of the driver population. The typical instrument panel viewing angel (i.e., the line-of-sight used to see the instrument panel from the eyellipse) is about 19.degree. below horizontal and the angular subtense (i.e., the amount of scan used to see the entire instrument display) is about 24.degree. H.times.6.degree. V.\nAdditional complications are caused by the problem of vertical disparity or divergence. When an object field is viewed through an optical system, each eye typically sees a somewhat different view. Vertical disparity is the angular difference along the vertical axis of an object point as viewed by each eye. Vertical disparity has a bearing upon driver viewing comfort. A driver's tolerance limit to vertical disparity influences the complexity of the display optics. An instrument display system should reduce vertical disparity to a level which is commensurate with driver comfort while not unduly complicating the display optics.\nStill further complications are caused by the high ambient light conditions which are present in most automobiles. Ambient light includes direct sunlight and specular reflections from surrounding objects which can shine into the driver's eyes and reduce display visibility. The instantaneous dynamic range of an eye adapted to a typical horizon sky luminance of about 3,000 foot-Lamberts (fL) is on the order of about 600:1. Hence, the black level for this eye is about 5fL and all stimuli at luminancelevels of 5fL or less look equally black. 
Hence, even if there were no transmission losses and"} +{"output_text": "ase inhibitors.\nFurther, in addition to the above-mentioned chymase inhibitors, a compound represented by the following formula (A) is disclosed as a chymase inhibitor (see Patent Document 23).\n\nHowever, there is no description of the compound represented by the above formula (A) in the above document.\nFurther, a compound represented by the following formula (B) is disclosed as a chymase inhibitor (see Patent Document 24).\n\nHowever, there is no", "input_text": "Lipid accumulation in the aorta: see Non-Patent Document 29,\nDiabetes: see Non-Patent Document 30,\nNephritis: see Non-Patent Document 31,\nFibrosis: see Non-Patent Document 32 and Non-Patent Document 33,\nPost-operative adhesion: see Non-Patent Document 34,\nGlaucoma: see Patent Document 1,\nHypereosinophilia: see Patent Document 2,\nAtopic dermatitis: see Non-Patent Document 35,\nPruritus: see Non-Patent Document 36,\nAsthma: see Non-Patent Document 37,\nEnteritis: see Non-Patent Document 38.\nFurther, recently, in addition to the chymase inhibitors described in the above books and reviews, imidazoledinedione derivatives (see Patent Document 3), phosphonic acid and phosphinic acid derivatives (see Patent Document 4, Patent Document 5, and Patent Document 6), benzothiophene sulfonamide derivatives (see Patent Document 7 and Patent Document 8), imidazole, thiazolimine, and oxazolimine derivatives (see Patent Document 9 and Patent Document 10), triazolidine derivatives (see Patent Document 11), pyridone derivatives (see Patent Document 12), indole derivatives (see Patent Document 13 and Patent Document 14), ring-fused pyrrole derivatives (see Patent Document 15), imidazopyridine derivatives (see Patent Document 16), benzimidazolone derivatives (see Patent Document 17), quinazolinedinone derivatives (see Patent Document 18), phthalazinone derivatives (see Patent Document 20), azabenzoimidazolone derivatives (see Patent Document 
21), and azaquinazolinedinone derivatives (see Patent Document 22) are disclosed as novel chymase inhibitors. However, there are no examples of practical application of the above chym"} +{"output_text": " to users based on the user's response to an offer to receive a free phone.\nThe airlines 100, 200 have established marketing agreements with credit card companies, phone companies, hotel chains and car rental companies to provide the business traveler with a more robust way to generate rewards in the form of frequent flyer miles. These marketing arrangements or associations have typically involved credit card companies, phone companies, hotel chains and car rental companies. Any purchases made through these \u201cco-branded\u201d partners were then", "input_text": "cash\u201d online, for example by visiting a portal site. The earnings can then be used to make online purchases, such as software.\nAgain, although the '210 patent and the CyberGold web site describe an incentive system that allows users to purchase products or services over the Internet, neither teaches the ability of the redeeming frequent flyer miles from a pre-existing account for reward points.\nU.S. Pat. No. 5,025,372, SYSTEM AND METHOD FOR ADMINISTRATION OF INCENTIVE AWARD PROGRAM THROUGH USE OF CREDIT, issued on Jun. 18, 1991 to Meridian Enterprises, Inc. The '372 patent describes an incentive award program in which credit is awarded to participants based on the participant meeting a designated level of performance under the system. This patent does not teach the ability to increase the reward points in a user's account by redeeming points from a pre-existing account such as a frequent flyer mileage program.\nWith regard to FIG. 1, a model of the frequent flyer systems of the prior art is presented. 
Two different airlines servers are shown surrounded by their related marketing partners, the first grouping labeled Airline 1 100 and the second independently operated but functionally similar grouping labeled Airline 2 200. In order to lure more business travelers, the airlines 100, 200 have established marketing agreements with travel related companies to provide the business traveler with a more robust way to generate rewards in the form or frequent flyer miles. These marketing arrangements or associations have typically involved credit card companies, phone companies, hotel chains and car rental companies. Any purchases made through these \u201cco-branded\u201d partners were then awarded to the user periodically. Bonus miles or points may additionally be accumulated based on the user's actions in response to offers made by the airline or in coordination with the partner company. For example, phone companies offer bonus miles"} +{"output_text": " radiation, are broken and the first and second dyes seep from the broken microcapsules.\nIn accordance with another aspect of the present invention, there is provided a method of forming an image on an image-forming substrate, the method comprising the steps of: (a) providing an image-forming substrate including a base member, a layer of microcapsules, coated over the base member, that contains a first type of microcapsule filled with a first dye, and a second type of", "input_text": "-forming system which comprises an image-forming substrate including a base member, a layer of microcapsules, coated over the base member, that contains a first type of microcapsule filled with a first dye, and a second type of microcapsule filled with a second dye. 
The first type of microcapsule exhibits a first pressure/temperature characteristic such that, when the first type of microcapsule is squashed and broken under a first predetermined pressure at a first predetermined temperature, the first dye seeps from the squashed and broken microcapsule. The second type of microcapsule exhibits a second pressure/temperature characteristic such that, when the second type of microcapsule is squashed and broken under a second predetermined pressure at a second predetermined temperature, the second dye seeps from the squashed and broken microcapsule. The image-forming system further comprises an image-forming apparatus that forms an image on the image-forming substrate, the image-forming apparatus including a pressure application unit that exerts the first and second predetermined pressures on the layer of microcapsules, the pressure application unit including a transparent plate member, a layer of radiation absorbent material coated over a surface of the transparent plate member, a first platen member elastically pressed against the layer of radiation absorbent material at the first predetermined pressure, and a second platen member elastically pressed against the layer of radiation absorbent material at the second predetermined pressure, with the image-forming substrate being interposed between the first and second platen members and the layer of radiation absorbent material, the image-forming apparatus further including an irradiating unit that irradiates the layer of radiation absorbent material with a first beam of radiation and a second beam of radiation, such that two portions of the layer of microcapsules, encompassed by two local areas of the layer of radiation absorbent material irradiated by the first and second beams of"} +{"output_text": " cord brush cutting tools, they are not as effective as nylon cord brush cutting tools in cutting through tougher vegetation.\nIt is an object of the present invention to provide a 
vegetation cutting tool which is more robust than known vegetation cutting tools.\nIt is a further object of the present invention to provide a vegetation cutting tool which is more effective than known vegetation cutting tools in cutting through tougher vegetation.\nAccording to the present invention there is provided a vegetation cutting tool comprising a hub and a blade, the hub", "input_text": " design decisions made in existing networks of digital signage displays have been made for a number of differing design decisions. Often the objective is to achieve a uniformity that eases administration and reduces the likelihood of a display being out of order. These systems were often designed for smaller distributions of screens. Networks that claim to have a common administrator often have distinct infrastructures to avoid bottlenecks, and to allow for better network specific templates and content issues.\nIt is, therefore, desirable to provide a digital display network having nodes that reduce the unnecessary consumption of bandwidth and provide additional flexibility in the rendering of content and manner in which they are provisioned. This invention relates to a vegetation cutting tool and to a blade therefor.\nAlthough a main use for the present invention and the following description is in relation to a vegetation cutting tool known as a brush cutter, it is to be understood that as used herein the expression \u201cvegetation cutting tool\u201d is not limited to a tool for use with a brush cutter per se. It is to be understood to refer in general to cutting tools which connect to a rotating shaft in a variety of cutting appliances such as, for example, brush cutters, lawn or grass mowers, garden edgers etc.\nA brush cutting tool including a circular hub and a nylon cord is well known. As the hub spins on the shaft of a brush cutter, the nylon cord assumes a radial orientation due to centrifugal force and acts as the cutting member. 
A major drawback with nylon cord brush cutting tools is that the cord lacks robustness and must be regularly replaced. Furthermore, nylon cord brush cutters struggle to cut through tougher weeds and other vegetation.\nAlso known is a brush cutting tool including a circular hub and a fixed radial blade member. Whilst such arrangements can generally deal with tougher weeds and are more robust than the nylon"} +{"output_text": " in the central database are accessed to provide instructions to the SSPs to complete the call to the subscriber location. The subscriber program may also be used to provide instructions to the SSPs to complete calls to a subscriber location in accordance with special services defined by the subscriber.\nU.S. Pat. No. 4,788,718 to KUROSAWA et al. discloses a system for providing custom incoming telephone call processing services to a subscriber operating at geographically diverse locations. A subscriber", "input_text": "An illustration of the basic components of an AIN network environment is shown in FIG. 1, which is provided to facilitate communication between a plurality of network locations or stations 72-86. As shown in FIG. 1, Central Offices (COs) 64-70 are provided for sending and receiving data messages from a Service Control Point (SCP) 56 via one or more Signaling Transfer Points (STPs) 51, 53 and 59. The data messages are communicated to and from the COs 64-70 and the SCP 56 along a Common Channel Signaling (CCS) network 88. Each CO 64-70 serves as a network Service Switching Point (SSP) to route telephone calls between a calling station (e.g., station 72) and a called station (e.g., station 84) through the trunked communications network 90-93.\nAdditional information regarding AIN and AIN-related network environments, see Berman, Roger K., and Brewster, John H., \u201cPerspectives on the AIN Architecture,\u201d IEEE Communications Magazine, February 1992, pp. 
27-32, the disclosure of which is expressly incorporated herein by reference in its entirety.\nA number of features provided by prior AIN or AIN-type intelligent networks relate to specialized call processing of incoming calls. For example, U.S. Pat. Nos. 4,611,094 and 4,611,096, both to ASMUTH et al., disclose a system for providing custom incoming telephone call processing services to a subscriber operating at geographically diverse locations. A subscriber program stored in a central database is accessed to provide instructions to the SSPs to complete incoming calls to one of the subscriber locations in accordance with special services defined by the subscriber. The subscriber program controls the Action Control Points (ACPs) to string together the desired call processing capabilities to process each call. Specified parameters stored"} +{"output_text": " crystal devices. The methods described include the use of a polymer dispersed liquid crystal (PDLC) and a polymer network liquid crystal (PNLC). The PDLC is a mixture of a liquid crystal material and a polymer material which is dispersed in the liquid crystal material. The PDLC is formed by mixing the liquid crystal material with the polymer material in a suitable solvent and then removing the solvent. The PDLC is then placed between two substrates and the polymer is allowed to cross-link to form a network", "input_text": " required.\nThe above techniques are suitable for many liquid crystal materials including those devices which use liquid crystal materials which exhibit and utilise the smectic mesophase, eg ferroelectrics. Suitable alignment techniques may also be found in GB 2210469 B.\nFerroelectric LCDs by Dijon in Liquid Crystals Applications and Uses, vol 1 Ed. Bahadur, World Scientific Publishing Co. Pte. Ltd. 1990 pp 305-360 and references therein discusses alignment processes for smectic phases for low molar mass materials. 
The filling of cells is believed to be possible only in the isotropic or nematic phase due to the viscosity of smectic phases. Generally materials with the following phase sequence give good alignment: $I \\rightarrow N^{*} \\rightarrow S_{A} \\rightarrow S_{C}^{*}$ or $I \\rightarrow S_{A} \\rightarrow S_{C}^{*}$\nwhereas materials with the following phase sequences are more difficult to align: $I \\rightarrow N^{*} \\rightarrow S_{C}^{*}$\nTypically, therefore, in order to use a liquid crystal material in the smectic phase it will involve heating the material to the nematic or isotropic phase and allowing it to cool slowly into an aligned smectic state. Should this technique be applied to a polymer liquid crystal material then the cooling time is usually very much longer in order to assist the alignment, though very often the alignment is poor.\nMaterials and Assembling Process of LCDs by Morozumi in Liquid Crystals Applications and Uses. vol 1 Ed. Bahadur, World Scientific Publishing Co. Pte. Ltd. 1990 pp 171-194 and references therein as the title suggests discusses methods for assembling liquid"} +{"output_text": " the data to the DQ line.\nThe row and column conductors are typically formed as a series of metal lines on a semiconductor substrate. The metal lines are separated by an insulating layer, and the insulating layer is covered by a passivation layer. The passivation layer is typically formed of a dielectric material, such as silicon dioxide, silicon nitride, or silicon oxynitride. The passivation layer is formed over the insulating layer and the metal lines to protect the metal lines from corrosion and other damage", "input_text": " in an array of rows and columns. A given memory cell is located in the vicinity of the intersection of a row conductor and a column conductor. The cell is \"accessed\" to perform a data read or write operation when the corresponding row and column conductors are activated. Referring to FIG. 
1, a typical DRAM design utilizes a row decoder to select a given row by placing a voltage thereon, which is positive in the case of n-channel field effect access transistors. This row conductor voltage allows all of the access transistors in the selected row to conduct charge between an information storage capacitor and the column conductor associated with each selected cell. Similarly, a column decoder is utilized to select a given column of memory cells for connection to data output (DQ and DQ) lines.\nFor example, if row R1 and column conductor C1 are selected, then data may be written (stored) or read (retrieved) from capacitor 11 by conduction through access transistor M11. Note that a given column conductor (e.g., C1) typically has associated with it a complement column conductor (e.g., C1) that is also selected for the given column. The complement column conductor provides a reference voltage during read operations so that the sense amplifier for the selected column (e.g., sense amp 1) can rapidly determine whether a high voltage, referred to as a \"1\", or a low voltage, referred to as a \"0\", is stored in the selected memory cell. The column functions are interchangeable, so that C1 may be selected to read a given cell (e.g., M21-21), with C1 then serving as its complement conductor, as when row conductor R2 is selected. The selected column conductor (e.g., C1) communicates the data to the DQ line, and the selected complement column conductor (e.g., CI) communicates"} +{"output_text": ". Leblanc et al. on Feb. 9, 1971, and in U.S. Pat. No. 3,795,538 granted to Michel B. Leblanc et al. on Mar. 5, 1974. The former patent discloses a polyimide prepolymer obtained by reacting a tetracarboxylic acid dianhydride with an aromatic diamine, and the latter patent discloses a polyimide prepolymer obtained by reacting a tetracarboxylic acid dianhydr", "input_text": "-layer printed circuit boards.\n2. 
Description of the Prior Art\nIn the field of fabricating printed circuit boards of a fewer number of layers, there have been widely utilized various epoxy resins having valuable properties including superior adhesive properties, remarkable resistance to chemical agents, high mechanical strength, and excellent dielectric properties. However, the epoxy resins are found to be no longer appropriate for fabricating multi-layer printed boards in use for constructing high density circuits and modules, since they do not always assure enough resistance to heat which is applied repeatedly to such printed boards for module mounting purposes, and, in addition, they frequently reduce reliability in conductivity of conductor layers thereof due to possible resin smear or heat expansion in the direction of thickness of the board. To settle the above problems, heat-stable polyimide resins are presently utilized and found to be satisfactory for the fabrication of the multi-layer printed boards. Particularly, polyimide resins of addition reacted type obtained by reacting unsaturated bisimides with diamines are most acceptable for fabricating the multi-layer printed boards, because the use of that material gives rise to remarkable and advantageous features. 
Included in the above advantageous features expected by the use of that material are to allow the formation of fine conductive paths as well as the drilling operation of minute holes with high accuracy, which are indispensable for high density circuit arrangement; to keep the heat expansion in the direction of thickness at a minimum for providing increased reliability of conductivity at plated through-holes; to eliminate the deposition of resin smear during drilling operation; to give increased bond strength to the conductor layers as well as increased hardness to the bases of the printed board at elevated temperatures for improving module mounting performance; and to well withstand the continuous operation at an elevated temperature of about 200\u00b0 C.\nSuch polyimide prepolymers with valuable characteristics have been disclosed in U.S. Pat. No. 3,562,223 granted to Michel B"} +{"output_text": " chuck body) contacts the back surface of the substrate for inhibiting flows of fluid between the first and second pressure-regulatable spaces.\nThe clamp includes a first portion that contacts the front surface of the substrate and a second portion that contacts the back surface of the substrate. The first portion of the clamp is movable relative to the second portion of the clamp to inhibit flows of fluid between the first and second pressure-regulatable spaces. The first portion of the clamp is movable relative to the second", "input_text": " the chucks and substrates for supporting transfers of heat. A second sealing stage collects gas escaping through the first sealing stage into an intermediate space between second portions of the chucks and substrates at a reduced pressure with respect to the pressure at which the gas is confined within the heat-transfer interface.\nThe processing region of the vacuum processor is a first pressure-regulatable space. 
The first sealing stage together with the first portions of the chuck and substrate forms a second pressure-regulatable space, and the second sealing stage together with the second portions of the chuck and substrate forms a third pressure-regulatable space. Pressure in the third pressure-regulatable space is reduced with respect to pressure in the second pressure-regulatable space to further inhibit leakage of gas from the second pressure-regulatable space into the first pressure-regulatable space.\nOne particular embodiment includes a chuck body having a mounting surface that supports the substrate for processing within the first pressure-regulatable space of the processing chamber. The mounting surface forms together with the substrate a second pressure-regulatable space for assisting transfers of heat between the chuck body and the substrate. A clamp presses the substrate against the mounting surface and forms together with the chuck body and the substrate a third pressure-regulatable space that extends beyond a periphery of the substrate between the first and second pressure-regulatable spaces.\nThe substrate includes a front surface (usually comprising devices in various stages of fabrication) exposed to pressure in the first pressure-regulatable space and a back surface exposed to pressure in the second pressure-regulatable space. The mounting surface contacts the back surface of the substrate for inhibiting flows of fluid (e.g., backside gas) between the second and third pressure-regulatable spaces. The clamp contacts the front surface of the substrate and the chuck body (or an extension of the"} +{"output_text": " oxides (NOx), and that is capable of operating at high firing rates.\nThe radiant gas burner is a well-known device that is used to heat a wide variety of products, including food, beverages, and other products. The radiant gas burner is typically used to heat a product by passing a gas through a flame that is created by a burner. 
The gas is typically passed through a flame that is created by a burner that is mounted in a combustion chamber. The burner is typically mounted in", "input_text": " this process can benefit by storing downloaded files locally. But then the operating system downloaded over the network is, once again, responsible for the often complex task of managing hardware components and files stored on the local storage device.\nThe computer management software method is used to enhance the operating system by adding additional software components as agents, daemons, or services. One typical way of using this method is to use anti-virus software that constantly scans stored files for any computer virus infection. This method may also be implemented by adding a software component that constantly monitors important files on the local storage device and attempts to self-heal any damaged or corrupted files. An additional implementation adds a software component that handles file updates pushed out from a server as a part of a computer management tool. The drawback of this method is that the software components acting as agents, daemons, or services are highly dependent on the operating system. The operating system has to provide necessary functions, such as managing local storage devices or network interfaces, for these software components to work properly.\nMany operating systems can also apply file level or directory level security to provide a certain level of protection against computer viruses, unauthorized access, user errors, or application errors that can corrupt important files. The drawback of this method is that it is operating system dependent, and a super user, an administrator, or a process running with full access privileges can accidentally modify, delete, or corrupt important files in the local storage.\nThe above methods, by themselves or in combination with other methods, provide some help in reducing the complexities involved with computers. 
However, none of the methods fundamentally changes how the operating system manages the components of a computer. Thus, a new approach is needed for managing computers and simplifying the manner in which files are distributed over a network. This invention relates to a broadly modulated radiant gas burner that yields minimal emissions of air-pollutants, especially nitrogen"} +{"output_text": " example of such a system is the Universal Mobile Telecommunications System (UMTS) which is a 3GPP standard for a mobile communication system that is based on Wideband Code Division Multiple Access (WCDMA) and is designed to support high speed packet based communication services based on the Global System for Mobile Communications (GSM) and General Packet Radio Service (GPRS) network standards.\nThe 3GPP standard is based on the GSM standard, and is therefore also referred to as", "input_text": " first sealing ring 70 and the auxiliary lip 73b. Thus, it is possible to prevent the pulser ring 76 from being contaminated by abraded metal powder generated by rolling of the balls. Accordingly, this keeps the detecting accuracy. (See, Japanese Laid-open Patent Publication No. 98332/2005)\nIn the prior art wheel bearing apparatus incorporated with a wheel speed detecting apparatus, the female connector 81 is formed to project radially outward and vertically to the axis of the bearing. Thus, a strong pressing force is applied to the first sealing ring 70, the magnetic sensor 80, via the connector 81, and thus the sensor holder 79 during connection of the connector 81. Accordingly, the positional accuracy of the first sealing ring 70 and the magnetic sensor 80 tends to be diminished. Additionally, the sensor holder 79 itself, made of plastic resin, may be broken. 1. Field of the Invention\nThe present invention relates to mobile communication networks, and especially to transmission signal defer controlling e.g. in Long Term Evolution (LTE) networks.\n2. 
Description of the Related Art\nThe evolution of cellular wireless communication systems has been marked with different generations. 1st generation (1G) included analog systems such as AMPS (Advanced Mobile Phone System) and NMT (Nordic Mobile Telephone) cellular phone networks, introduced in the early 1980s. The second generation (2G) introduced digital cellular telephony such as the GSM (Global System for Mobile Communications) standard, introduced in the early 1990s, which was standardized by the European Telecommunication Standards Institute (ETSI). GSM applies Time Division Multiple Access (TDMA) based radio interface. GSM is still the most widespread standard used in mobile communications.\nAfter the 2G networks, 3rd Generation Partnership Project (3GPP) has standardized globally applicable system specification for 3rd generation mobile communication system. An"} +{"output_text": ", a means for eliminating the effects of undesired stray signals in its circuits, that is simple in design and that is inexpensive to produce;\n(B) In an electrical device e.g. a plug and socket, a means for eliminating the effects of undesired stray signals in its circuits, that is easily adaptable to single-wire or multi-wire connections;\n(C) In an electrical device e.g. a plug and socket, a means for eliminating the effects of undes", "input_text": " from a sensor or thermocouple are first attached, for convenience, to an electrical connector, for coupling to an instrument or the like. As soon as the conductors of the connector are electrically coupled to the circuit, a potential antenna is created. Of particular importance are thermocouple connectors which are particularly susceptible to the antenna-effect because of the long lead conductive path of the conductors themselves and the high input impedance of the instrumentation.\nThis invention applies to the fields of use wherein there is necessity for including an electrical device in a low level signal circuit, e.g. 
a thermocouple sensor, and provides a new apparatus for eliminating undesired electromagnetically-induced stray signals.\nAccordingly, it is an object of this invention to provide means associated with an electrical device that is capable of removing stray signals that may be induced in the leads of the device.\nIt is another object of this invention to provide an apparatus for eliminating the effects of undesired stray signals in circuits, that is uncomplicated in design, and that is simple and relatively inexpensive to produce.\nIt is still another object of this invention to provide means associated with an electrical device for eliminating the effects of undesired stray signals in its circuits, that is easily adaptable to single-wire or multi-wire connections.\nAnother and further object of this invention is to provide means associated with an electrical device for eliminating the effects of undesired stray signals in its circuits, that can be manufactured easily in various configurations to accommodate differing circuit requirements.\nAnd yet another and further object of this invention is to provide means associated with an electrical device for elimination of the effects of undesired stray signals in its circuits in which operative elements of the device may be configured to facilitate insertion and removal of wire conductors.\nOther objects are to provide:\n(A) In an electrical device e.g. 
a plug and socket"} +{"output_text": " signal processing apparatus which comprises: signal receiving means for receiving an input signal containing an encoded video signal and an encoded voice signal, and for recovering from the input signal the encoded video signal and encoded voice signal for output; video signal extracting means for extracting a designated frame from the encoded video signal, and for outputting the extracted frame as a representative video signal; video signal processing means for applying prescribed processing to the representative video signal, and for outputting the thus processed signal as a video signal; and voice", "input_text": ". 9, a prior art sound signal processing apparatus 1 is an apparatus that accepts at its input a reproduced sound signal A of a digital sound signal transmitted from a sound signal transmitting apparatus 2, and that converts it into an analog sound signal and outputs the analog sound signal as an output sound signal B.\nIn recent years, advances have been made in sound quality and sound multiplexing in stereo broadcasting or the like, and the reproduced sound signal A from the sound signal transmitting apparatus 2 may contain a large amount of information.\nHowever, since the above prior art sound signal processing apparatus 1 converts the reproduced sound signal A into an analog sound signal in the order in which it is input, if the reproduced sound signal A is a signal containing a large amount of information, problems will occur, such as interruptions in the sound corresponding to the reproduced sound signal A, unless the speed with which the sound signal processing apparatus 1 converts the reproduced sound signal A into the analog output sound signal B is fast enough.\nA first aspect of the present invention concerns a video and voice signal processing apparatus which comprises: signal receiving means for receiving an input signal containing an encoded video signal and an encoded voice signal, and for 
recovering from the input signal the encoded video signal and encoded voice signal for output; video signal extracting means for extracting a designated frame from the encoded video signal, and for outputting the extracted frame as a representative video signal; video signal processing means for applying prescribed processing to the representative video signal, and for outputting the thus processed signal as a video signal; and voice signal processing means for applying prescribed processing to the encoded voice signal, and for outputting the thus processed signal as a voice signal.\nIn this configuration, frame decimation is applied to the encoded video signal, but is not applied to the encoded voice signal; therefore, all voice signals can be output unchanged.\nThis invention also provides a video and voice"} +{"output_text": " be realized by a simple circuit.\n2. Linear Interpolation Method\nThis method consists in picking up data at the position of the pixel following pixel number conversion from pixel data of an input picture, and can be realized by a simple circuit.\n3. Non-Linear Interpolation Method\nThis method consists in picking up data at the position of the pixel following pixel number conversion from pixel data of an input picture, and can be realized by a simple circuit.\nThe above-described", "input_text": " the above-described processing for conversion of the number of pixels is hereinafter explained.\nThe processing for conversion of the number of pixels is the processing for increasing or decreasing the number of output pixels to a desired number with respect to the number of input pixels during one scanning line period. 
Supposing that the sampling frequency of the input is equal to that of the output, the increased number of the pixels or the decreased number of the pixels are tantamount to an enlarging processing or to a contracting processing of an input picture (processing of changing the number of pixels in an increasing direction or in a decreasing direction). If attention is directed to the sampling of the input and output pixels instead of to the number of pixels, the above technique is tantamount to creating data at points different from original sampling points or to generating interpolated pixels at these different points from the input pixel data.\nThe conversion processing for the number of scanning lines for interlaced and non-interlaced pictures is hereinafter explained.\nThe conversion of the numbers of scanning lines is a processing of changing the number of output lines to a desired value from the number of the input scanning lines during each vertical scanning period (processing of converting the number of lines in an increasing direction or in a decreasing direction). Supposing that the number of lines of the input is equal to that of the output, the increased number of the lines or the decreased number of the lines are tantamount to an enlarging processing or to a contracting processing of an input picture in the vertical direction. Thus, the conversion processing for the number of scanning lines means interpolation of line data.\nThere are a variety of interpolating methods, which may roughly be classified into the following three methods:\n1. Nearest Neighbor Interpolation Method\nThis method consists in picking up data at the closest position to the pixels following pixel number conversion from pixel data of an input picture, and can"} +{"output_text": " The RF coils are used to add energy to the nuclear spin system in a controlled fashion. As the nuclear spins then relax back to their rest energy state, they give up energy and produce an RF signal. 
This signal is detected by the RF coil, and after appropriate processing, may be used to produce images.\nWhen a substance such as human tissue is subjected to a uniform magnetic field (polarizing field B0), the individual magnetic moments of the spins in the tissue attempt to align with this polar", "input_text": " switch to respond by stopping or reversing the tape deck mechanism. The need for switches and monostable multivibrators in a special sensing circuit again increase the cost of the cassette tape deck equipped with such a system.\nThe present invention is directed to overcome the above disadvantages noted in conjunction with prior art systems and to provide a new system which surpasses the prior art in efficiency and simplicity. Magnetic resonance imaging (MRI) is a medical imaging modality that can create images of the inside of a human body without using X-rays or other ionizing radiation. MRI uses a powerful magnet to create a strong, uniform, static magnetic field (i.e., the \u201cmain magnetic field\u201d). When a human body, or part of a human body, is placed in the main magnetic field, the nuclear spins that are associated with the hydrogen nuclei in tissue water become polarized. This means that the magnetic moments that are associated with these spins become preferentially aligned along the direction of the main magnetic field, resulting in a small net tissue magnetization along that axis (the \u201cz-axis,\u201d by convention). An MRI system also comprises components called gradient coils that produce smaller amplitude, spatially varying magnetic fields when current is applied to them. Typically, gradient coils are designed to produce a magnetic field component that is aligned along the z axis (i.e., the \u201clongitudinal axis\u201d), and that varies linearly in amplitude with position along one of the x, y, or z axes. 
The effect of a gradient coil is to create a small ramp on the magnetic field strength, and concomitantly on the resonance frequency of the nuclear spins, along a single axis. Three gradient coils with orthogonal axes are used to \u201cspatially encode\u201d the MR signal by creating a signature resonance frequency at each location in the body. Radio frequency (RF) coils are used to create pulses of RF energy at or near the resonance frequency of the hydrogen nuclei."} +{"output_text": " of the roll paper driving motor.\nHowever, in the technology disclosed in Japanese Patent Application Laid-open No. S62-83968, the roll paper driving motor is controlled in a manner following the amount of displacement of the movable member. Therefore, if the movable member is displaced due to a disturbance, the roll paper driving motor may be controlled in a manner following the amount of displacement of the movable member. As a result, the roll paper driving motor may be controlled in a manner following", "input_text": " roll paper is directly pulled by carriage rollers, load fluctuation acts as a disturbance to the control of a conveying motor driving the carriage rollers, resulting in unstable control and therefore desired stop position precision may not be obtained. In addition, if roll paper having a large moment of inertia is directly pulled by the carriage rollers, the carriage rollers might slip on the roll paper to cause the amount of paper to be fed to change, resulting in print deviation even if the conveying motor is precisely controlled.\nTherefore, such a roll paper compatible printer includes a driving source for unwinding the roll paper and pulling to convey the roll paper toward the carriage rollers. The driving source for unwinding and pulling the roll paper is provided separately from the conveying motor for driving the carriage rollers. 
Furthermore, a movable member is arranged upstream of the carriage roller so as to come into contact with the unwound roll paper and apply an optimal tension thereto. Thus, control of the conveying motor may be prevented from being affected adversely by a fluctuation in the load, which fluctuates as the amount of the remaining roll paper changes. Moreover, even if the roll paper slips, the amount of paper to be fed into the printing position can be controlled precisely.\nIn addition, as a technology related to conveyance of roll paper, Japanese Patent Application Laid-open No. S62-83968 discloses a technology that continuously detects the amount of displacement of a movable member, and increases or decreases the degree of roll paper to be unwound in a manner following the amount of displacement of the movable member to continuously manage pulling and conveying of the roll paper as well as tension thereof. More specifically, the technology disclosed in Japanese Patent Application Laid-open No. 
S62-83968 converts an encoder pulse corresponding to rotations of a roll paper driving motor for driving a roll paper shaft into a voltage, compares the voltage with a reference voltage, and performs proportional control"} +{"output_text": " a glass substrate of the present invention are characterized in that the glass material is placed in the forming die, and the glass material is molded into a glass substrate in the shape of a parallel plate by pressure.\nA second method for manufacturing a glass substrate of the present invention includes placing a glass material in a forming die including a pair of upper and lower dies and a control member for controlling the space between the upper and the lower die, and molding the glass material into a glass substrate in the shape of", "input_text": " diameter X and a thickness Y satisfies X greater than 40 Y.\nIt is preferable that the control member of the forming die of the present invention controls the space between the upper and the lower die to be 1 mm or less.\nFurthermore, it is preferable that at least one of the upper and the lower die is provided with a concave in the central portion thereof to specify the die used. In addition, a plurality of concaves can increase the number of types of dies to be distinguished.\nNext, a glass material to be molded into a glass substrate of the present invention is used for manufacturing a glass substrate in the shape of a parallel plate with a forming die including a pair of upper and lower dies and a control member for controlling the space between the upper and the lower die. The glass material has an amount of thermal contraction larger than that of the control member. 
The glass material is shaped so that it comes into point-contact with the forming die when placed therein, and as the contact portion between the glass material and the forming die is increased by pressure molding, the glass material is transformed continuously so as to prevent air from entering the contact portion.\nA first method for manufacturing a glass substrate of the present invention includes placing a glass material in a forming die including a pair of upper and lower dies and a control member for controlling the space between the upper and the lower die, and molding the glass material into a glass substrate in the shape of a parallel plate by pressure. The amount of thermal contraction of the control member is smaller than that of the glass material. The glass material comes into point-contact with the forming die when placed therein, and as the contact portion between the glass material and the forming die is increased by pressure molding, the glass material is transformed continuously so as to prevent air from entering the contact portion.\nThe glass material to be molded into a glass substrate and the first method for manufacturing"} +{"output_text": " gas is compressed by the rotary compression element 118.\nThe driving element 114 is a motor that is driven by an electric power source (not shown) and is disposed at the upper space inside the sealed container 112. The rotary compression element 118 is a rotary compression element that is driven by the driving element 114 and is disposed below the driving element 114. The rotary compression element 118 includes the first rotary compression element 132 and the second rotary compression element 134. The first rotary compression element 132 is a rotary compression", "input_text": "/000833, WO 05/082089 (U.S. Publication No. 2007/0203100), WO 06/047195, WO 06/100633, WO 06/115188, WO 06/131336, WO 2007/024922, WO 07/109,330, WO 07/116,866, WO 08/023,783 (U.S. Publication No. 
2008/0200535), WO 08/029,370, WO 08/114,157, WO 08/074,820, WO 09/043,889, WO 09/057,079, and U.S. Pat. No. 6,069,143. Also see Hale et al., J. Med. Chem., 47:6662 (2004).\nThere still remains a need for compounds useful as S1P1 agonists and yet having selectivity over S1P3.\nApplicants have found potent compounds that have activity as S1P1 agonists. Further, applicants have found compounds that have activity as S1P1 agonists and are selective over S1P3. These compounds are provided to be useful as pharmaceuticals with desirable stability, bioavailability, therapeutic index, and toxicity values that are important to their drugability. The present invention relates to a rotary compressor that includes a driving element and a rotary compression element inside a sealed container.\nBoth currently and in the past, a vertical rotary compressor has a configuration shown in FIG. 6, where a driving element 114 is disposed at an upper space inside a vertical cylindrical sealed container 112, and a rotary compression element 118 including a first rotary compression element 132 and a second rotary compression element 134 driven by a rotary shaft 116 of the driving element 114 is disposed below the driving element 114. The rotary compressor 110 is a so-called internal high-pressure-type multi-stage compressing compressor in which a refrigerant"} +{"output_text": " memory cell array 102. In this situation, the voltage V.sub.pp applied across the source and drain of the transistor can cause the transistor to be forward biased. As a result, the transistor can be turned on, thereby causing the transistor to conduct current. This current can cause the voltage at the node to which the transistor is connected to rise above the voltage V.sub.pp. 
This rise in voltage can cause the transistor to be turned on again, thereby causing the voltage at the node", "input_text": " will be set to reflect that an erase error has occurred.\nAssuming that the second byte of cells has been properly erased, the remaining bytes will be verified and any necessary additional erase pulses will be applied. Once the last address has been verified, the erase sequence is ended and status register 126 is updated to indicate that the erase sequence has been successfully completed.\nWith reference again to FIG. 3 (and to FIG. 5), memory chip 103 includes a conventional circuit 130 (shown in FIG. 3) for monitoring the status of one or more nodes of the chip. In response to detecting a condition which requires that the normal operational flow of the state machine be stopped (denoted below as an \"illegal\" condition), monitoring circuit 130 asserts a signal (labeled \"ILLEGAL\" in FIG. 3) having a high level (a logical \"one\") to command execution logic 124. In the absence of an illegal condition, the signal \"ILLEGAL\" is at a low level (a logical \"zero\"). In response to a high level of the signal \"ILLEGAL,\" logic unit 124 (and in particular, logic circuitry 124A shown in FIG. 5 within unit 124) generates a halt signal (labeled \"HALT\" in FIG. 5) for causing the halting of circuit operation). Conventional logic circuitry 124A is typically implemented using combinational logic. Logic unit 124 asserts the halt signal to state machine 120 and optionally also to other components of chip 103. In response to the halt signal, state machine 120 halts (aborts or otherwise stops) any ongoing memory erase operation.\nHowever, a problem can arise as a result of the described conventional technique for halt signal generation. 
This problem occurs when the halt signal is generated at a time when high voltage (e.g., voltage V.sub.pp) is applied across the source and drain of one or more transistors of"} +{"output_text": "The present invention relates to a method for manufacturing a semiconductor device, and more particularly, to a method for manufacturing a semiconductor device having a metal gate.\n2. Description of the Related Art\nAs the integration of semiconductor devices increases, the size of a gate electrode of a transistor is reduced. As a result, the thickness of a gate insulating layer is reduced. However, as the thickness of the gate insulating layer is reduced, the leakage current of the gate insulating layer increases.\nTo reduce the", "input_text": "A method for providing clock signals on a chip, in accordance with the present invention, includes the step of providing a universal clock generator circuit comprising an oscillator unit including circuitry for providing a first clock frequency, and a plurality of load blocks, the load blocks being selectively connectable to the oscillator such that a range of clock rates are derived from the first clock frequency by selectively connecting a number of the load blocks to the oscillator unit to provide one of a plurality of clock frequencies from a same output. The method further includes the steps of placing the universal clock generator circuit at a plurality of locations on a semiconductor chip and trimming the load blocks to provide a clock frequency from each universal clock generator circuit on the semiconductor chip.\nIn other methods, the step of trimming the load blocks may include the step of providing connections and disconnections between the load blocks while forming the load blocks. The step of trimming the load blocks may include the step of providing connections and disconnections between the load blocks by respectively employing anti-fuses and fuses between the load blocks. 
The step of trimming the load blocks may include the step of providing connections and disconnections between the load blocks by employing transistors between the load blocks to enable and disable connections. The step of trimming the load blocks may include the step of eliminating load blocks from a layout to reduce layout area. The step of trimming the load blocks may also include the step of providing connections and disconnections between the load blocks by programming transistors between the load blocks to enable and disable connections. The step of programming transistors may include programming the transistors by employing a memory storage device. The transistors may be programmed after the semiconductor chip has been fabricated or packaged.\nThese and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings. 1. Field of the Invention\n"} +{"output_text": "\u201d between the RF receiver coils and the room temperature pipe.\nIn a preferred embodiment of the invention, the radiation shields are made of a material which is transparent to RF fields and has a low thermal conductivity. The radiation shields are preferably made of a material which is transparent to RF fields and has a low thermal conductivity. The radiation shields are preferably made of a material which is transparent to RF fields and has a low thermal conductivity. 
The radiation shields are preferably made of a", "input_text": " metal strips having a thickness of approximately 100 \u03bcm and a width transverse to the z axis of approximately 0.5 mm to 2 mm, preferably approximately 1 mm.\nA particularly preferred embodiment of the NMR probe head in accordance with the invention provides for radiation shields disposed between the RF receiver coil system and the room temperature pipe which surround the room temperature pipe in a radial direction, extend in the z direction and are made of one or more materials oriented in the z direction which are almost completely transparent to RF fields or at least have an absorption of less than 5%, preferably less than 1% for RF fields.\nAlthough cryotechnology has used radiation shields for some time to curtail heat radiation losses, this procedure is not directly applicable for a cooled NMR probe head since the normally metallic radiation shields, which reflect heat radiation, either completely block or at least strongly impair propagation of RF fields from the measuring sample to the RF receiver coils such that the incoming NMR signals are at least highly attenuated, distorted or completely unusable.\nIn accordance with the inventive solution, the radiation shields provided in the vacuum between the RF coils and the room temperature pipe solely comprise materials which are oriented in the z direction. The axial orientation of the radiation shield material prevents their finite susceptibility from impairing the resolution of the NMR signals. On the other hand, the physical properties of the materials should be such as to effect as large a transparency as possible in the region of radio frequency radiation. 
In most cases, this material property has the associated disadvantage that reflection of lost heat back towards the measuring sample is not very high.\nIt is advantageous if the radiation shields have at least a minimum separation from one another in the radial direction and do not contact each other or at the most contact at points or linearly to prevent direct heat conduction between the individual radiation shields in a radial direction which would lead to a thermal \u201cshort circuit"} +{"output_text": ".\nHowever, the prior art does not provide a mask that is adjustable to fit a wide range of facial shapes and sizes. The prior art also does not provide a mask that is adjustable to fit a wide range of facial shapes and sizes, and that is customizable to fit a wide range of facial shapes and sizes. The prior art also does not provide a mask that is adjustable to fit a wide range of facial shapes and sizes, and that is customizable to fit a wide range of facial", "input_text": " higher density ROM device. 1. Field of the Invention\nThe present invention relates to a face mask apparatus, and more particularly, to a face mask apparatus for sleep apnea treatment, and methods of making same.\n2. Description of the Prior Art\nSleep apnea is a condition characterized by pauses in breathing or shallow breaths while sleeping. The pauses in breathing may last for a few seconds to a few minutes, and may occur more than 30 times an hour. Left untreated, sleep apnea leads to excessive daytime sleepiness and an increased risk of high blood pressure, heart attack, stroke, obesity, diabetes, and heart failure.\nTreatment options for sleep apnea generally include lifestyle changes (e.g., weight loss, avoiding sleeping on one's back, avoiding alcohol, smoking cessation), surgery, mouth pieces, and breathing devices. The most common treatment for sleep apnea is a continuous positive airway pressure (CPAP) or automatic positive airway pressure (APAP) device. 
These devices blow pressurized air via a hose to a nasal pillow, nose mask, or facial mask at a pressure high enough to splint the airway open during sleep.\nIt is known in the prior art to provide sleep apnea treatment medical devices. Currently, masks are made from plastic, and are adjusted by either foam or gel to suit the comfort level of the patient. In addition, the gel or foam is also utilized to form the seal between the mask and the patient's skin. The seal is an important part of the effectiveness of the mask because it ensures that air does not leak out.\nIt is also known in the art to provide customizable masks for facial application. It is further known in the art to use computer aided design for custom face mask design and manufacture. And it is also known to provide three-dimensional (3D) facial data for use for fabrication of a custom fit mask"} +{"output_text": " coefficient K1 of the steam turbine 202 by the steam turbine output 211. The gas turbine output 213 is estimated on the basis of the gas turbine inlet steam pressure 232 by multiplying the correction coefficient K2 of the gas turbine 201 by the gas turbine output 211.\nIn the conventional single fuel type (a natural gas combustion type) single shaft combined plant using a premix burner in the gas turbine combustor 214 as shown in FIG. 2, the steam turbine output 212 is estimated on the basis of", "input_text": " ratio, it is enough to know the shaft output 211 of the generator G, which is the total of the output of the steam turbine 202 and the output of the gas turbine 201 (a gas turbine output 213). It is not necessary to know the output of the steam turbine 202 and the gas turbine output 213 separately.\nOn the contrary, an amount of air provided into a compressor C\u2032 and an amount of air provided into a burner 314 are controlled on the basis of the gas turbine output 213 (on the MW basis). However, the gas turbine output 213 can not be directly measured. 
So, an operating unit 210 calculates an estimated output 212 (MW conversion) of the steam turbine, which is estimated to be outputted from the steam turbine 202. Then, a subtractor calculates the gas turbine output 213 by subtracting the estimated output 212 of the steam turbine from the measured shaft output (on the MW basis) of the generator G\u2032.\nA combustor bypass valve opening instruction 216 and an IGV opening instruction 217 are calculated so that a stable combustion situation can be obtained in a gas turbine combustor 214 on the basis of the calculated gas turbine output 213. A combustor bypass valve 220 and an IGV 221 are driven to thereby control the amounts of air introduced into the compressor C\u2032 and the combustor 214. That is, a first function generator 204 receives a signal indicating the gas turbine output 213 and outputs a signal 216 indicative of an optimal combustor bypass valve opening. A second function generator 205 receives a signal indicating the gas turbine output 213 and outputs a signal 217 indicating an optimal IGV opening.\nIn a conventional single fuel type (a natural gas combustion type) single shaft combined plant using a premix burner in the gas turbine combustor 214 as shown in FIG. 
2, the steam turbine output 212 is estimated on the basis of an intermediate turbine inlet steam pressure 231 by multiplying the correction"} +{"output_text": " have been proposed, and the concept has been applied to a variety of different problems.\nPrediction markets have been used to predict the outcome of elections, the outcome of sporting events, the outcome of stock market trades, the outcome of political elections, the outcome of auctions, the outcome of auctions for government contracts, the outcome of auctions for corporate control, the outcome of auctions for corporate control in the presence of hostile takeovers, the outcome of auctions for corporate control in the", "input_text": " other directly, through, for example, persuasion, orders, providing information and acting as role models, and they adjust or modify their behavior in consequence of their interaction with other individuals, and the environment in which they operate. Communication, and thus social comparison, depends on the individuals. New issues promote discussion and comparison as agreement patterns emerge.\nPrediction Markets (PM)\nMarkets are considered to be a method of allocating resources among competing uses. Markets can also be used when there is an absence of an arbiter with helpful information. Prices ensure that the different agents competing for access have a common standard for comparison across different choices. The market clearing process ensures that each resource is assigned to its best use. Different market designs satisfy different purposes.\nContinuous double auction markets provide goods on demand to buyers who are willing to pay the going rate, while call or options markets improve prices for buyers and sellers when time is not the most crucial factor, allowing for hedging and risk allocation. 
When a plurality of buyers have needs for different goods, which also are interdependent, combination markets may be necessary.\nPrediction markets are a form of market in which the goods being traded are securities whose values are determined by the outcomes of future events. The securities are structured so that trading between buyers and sellers causes the price to reflect the probability of the underlying event. When a trader sees a market price (probability) that is less than her expected probability for the event, she will see a profit opportunity in buying more, thus likely driving the price up. The new price reflects a higher probability to others monitoring the market.\nPrediction markets have been applied to a variety of problems and questions. Several variations on the basic idea have been described, making it possible to find answers for many different types of questions, or apply the concept in a wider set of circumstances. New variations of the original prediction market concept"} +{"output_text": " gas supply duct. The nozzle is preferably designed as a hollow body which is open at the end opposite the head. The nozzle is preferably designed as a hollow body which is open at the end opposite the head and which is closed by a cap. The nozzle is preferably designed as a hollow body which is open at the end opposite the head and which is closed by a cap. The nozzle is preferably designed as a hollow body which is open at the end opposite the head and which is closed by a cap", "input_text": " closed by the loading member in a first position of the loading member and can be connected to the inlet in a second position of the loading member. The loading member adopts the second position during the welding process, so a gas can pass via the gas supply duct and through the loading member to the weld stud which has a through-duct. 
Admission of the melt during the welding process into the fastening element designed substantially as a hollow body is prevented by the supply of a gas to the welding position. With respect to the gas supply duct formed in the guide duct, the loading member acts as a valve which only clears the gas supply duct when the welding process is carried out.\nA design of a welding device in which the guide duct and the loading member have a substantially circular cross section is preferred. The loading member can be a tubular component of which the end opposite the head is hermetically sealed.\nAccording to a further advantageous embodiment of the welding device, it is proposed that the loading member be guided in a substantially gas-tight manner in the guide duct. This embodiment of the welding device is advantageous if the gas is an inert gas, in particular an argon-containing inert gas. The consumption of the gas is reduced by the gas-tight design so the economic viability of the welding device is further improved.\nAccording to a further advantageous embodiment of the welding device, it is proposed that the inlet into the duct of the loading member be designed in the form of an elongated recess which extends in the longitudinal direction of the loading member. The length of the recess or of the inlet is calculated such that it corresponds at least to a displacement path of the loading member during the welding process. This ensures that the supply of gas is not interrupted during the advance of the loading member.\nAccording to a further advantageous embodiment of the welding device, it is proposed that the loading member have a nozzle with the"} +{"output_text": "oelectromechanical systems (NEMS). NEMS devices are fabricated using the same microfabrication techniques used to fabricate integrated circuits. The NEMS devices are typically fabricated on a silicon substrate using standard integrated circuit fabrication techniques. 
The NEMS devices are typically fabricated using the same microfabrication techniques used to fabricate integrated circuits. The NEMS devices are typically fabricated using the same microfabrication techniques used to fabricate integrated circuits. The NEMS devices are typically fabricated", "input_text": "With the method according to the invention for welding a fastening element, in particular a weld stud, on a workpiece, it is proposed that a fastening element comprising a through-duct be supplied to a conveying and holding system comprising a guide member with a guide duct and be conveyed by a loading member capable of moving to and fro in the longitudinal direction of the guide duct to one end of the guide duct and be held therein during the welding process, the loading member closing, in a first position, a gas supply duct which opens into the guide duct and clearing it in a second position so a gas flows through a duct formed in the loading member and the through-duct in the fastening element to the welding position. In the 21st century, people are developing smart materials and smart sensors. The smart sensors, made of smart materials, provide in association with other components like actuators and control systems, the functional capability to react to internal and external environments and achieve adaptability. 
Examples include, but are not limited to, life science research involving the study of interaction between biological and other molecules, in-vitro diagnostics, food safety whereby bacteria and toxins are detected without the need for time-consuming growth experiments, fresh water control involving the detection of heavy metal ions in fresh water or terror-related compounds, such as, ricin in fresh water supplies and gas detection, detection of explosives, chemical warfare agents, narcotics and the like.\nThe realization that many molecular phenomena result in mechanical responses at the nanoscale level promises to bring about a revolution in the field of chemical, physical, and biological sensor development. Exploiting nanoscale mechanics for molecular recognition is a paradigm shift in sensor technology. In a quest for smaller, faster, better, smarter sensors, the micro-electromechanical systems (MEMS) have been scaled to the submicron range, leading to the new category of nan"} +{"output_text": " the case of a high level of circuit integration.\nThe cost of a chip is also increased by the fact that the test patterns are generated by an external tester. The external tester is typically slower than the chip and, therefore, prevents the test from being run at speed.\nThe cost of a chip is also increased by the fact that the test patterns are generated by an external tester. The external tester is typically slower than the chip and, therefore, prevents the test from being run at speed.", "input_text": " externally generated test pattern is supplied to an input of the chain, and a response on an output of the chain is compared to a known \"correct\" response. 
If a block in the chain has a \"stuck-at fault,\" the output response will be different from the correct response.\nAlthough Scan is good at detecting \"stuck-at\" faults, it is not very good at detecting \"timing faults\" because it cannot be run \"at speed.\" Scan requires an external tester to generate and supply the test pattern to the chip and evaluate the response of the chip. However, the external tester generally runs slower than the clock speed of the chip and, therefore, prevents the test from being run at speed.\nMoreover, Scan is costly to implement. With each switch that is added to the chip and with each register that is modified or added, this increases overhead and accordingly the cost of the chip is increased. The cost increase typically results from an increase in die area and an impact on the design schedule. Scan can increase the cost of a chip by 5% to 20%.\nThe BIST architecture generally involves the construction of a test pattern generator (\"TPG\") and a test answer evaluator (\"TAE\") on a chip. The TPG generates test patterns for blocks under test and the TAE evaluates the responses of the blocks to the test patterns. Although good at detecting timing faults, BIST increases cost of the chip by 10% to 20%.\nScan BIST is a combination of Scan and BIST. Switches, registers, a TPG and a TAE are all constructed on a chip. However, the increase in cost is the greatest among the three test architectures, typically between 10% and 25%.\nWith all three architectures, the cost of a chip is increased disproportionately as the level of circuit integration is increased. Thus, the increased cost can become quite significant in"} +{"output_text": " string is generated by a portable terminal device and is then encoded into a binary number. The binary number is then transmitted to a server computer. The server computer decodes the binary number and compares it to a stored binary number. If the two numbers match, the user is authenticated. 
If the two numbers do not match, the user is denied access to the particular service.\nU.S. Pat. No. 5,724,423 discloses a user authentication service which is both highly secure", "input_text": " No. 5,428,349 discloses a password access method/algorithm effected by generating a pseudorandom array of each letter of the alphabet and the numerals 0 through 9 such that the password entry can be monitored without disclosing the letters or numerals contained in the password. The preferred arrangement is a square matrix of six rows and six columns of characters. The user enters the password by selecting either the row or column containing each letter of a memorized password.\nU.S. Pat. No. 5,478,994 discloses a secure credit card 10 having a body member to which is attached a microprocessor controller electrically coupled a Programmable Read Only Memory (PROM) device programmed with a series of random numbers in a predetermined sequence. The random numbers are identical to random numbers in a host computer and in the identical sequence as the random numbers in the host computer. This computer is accessible upon each use of the credit card. The Programmable Read Only Memory (PROM) accesses the next random number in sequence with each use of the credit card to permit verification by comparing the random number with each use of the credit card with the next random number in sequence as indicated by the computer. A switch actuated with each use of the credit card provides a pulse signal that activates the microprocessor controller to turn on the Programmable Read Only Memory (PROM) to access the next random number in the sequence. A counter connected to the microprocessor controller counts the number of pulse signals received to count each use of the credit card. A display device displays the next Personal Identification Number (PIN) in the sequence each time a pulse is received.\nU.S. Pat. No. 
5,724,423 discloses a user authentication service which is both highly secure and user friendly. To access a particular service, a user simply enters a PIN using a portable terminal device which encodes the PIN. More specifically, a character"} +{"output_text": " energy to interference power.\nThe base station uses the TADD and TDROP thresholds to determine whether to send a PSMM to the mobile station. If the TADD is greater than TDROP, the base station sends a PSMM to the mobile station. If the TDROP is greater than TADD, the base station does not send a PSMM to the mobile station.\nThe base station uses the TADD and TDROP thresholds to determine whether to send", "input_text": "\n2. The strength of a Candidate Set pilot exceeds the strength of an Active Set pilot by more than a threshold (TCOMP) \u00d7 0.5 dB, and a PSMM carrying this information has not been sent since the last Handoff Direction Message (HDM) or Extended Handoff Direction Message (EHDM) was received.\n3. The strength of a pilot in the Active Set or Candidate Set has fallen below a threshold (TDROP) for greater than a predetermined time period (TTDROP), and a PSMM carrying this information has not been sent since the last HDM or EHDM was received.\nTADD is a threshold above which the received signal is of sufficient strength to effectively provide communications with the mobile station. TDROP is a threshold value below which the received signal energy is insufficient to effectively provide communications with the mobile station.\nIn an IS-95B communication system, the mobile station sends an autonomous PSMM according to one of two sets of rules as chosen by the base station. The first set of rules is the same as the rules specified in IS-95A. 
The second set of rules uses a dynamic threshold defined as: T_DYN = (SOFT_SLOPE / 8) \u00d7 10 \u00d7 log( \u2211_{i \u2208 A} (Pilot Ec/Io)_i ) + (ADD_INTERCEPT / 2),\nwhere the parameters SOFT_SLOPE and ADD_INTERCEPT are specified by the base station and the summation is performed over all pilots in the Active Set. Ec/Io is the ratio of pilot"} +{"output_text": " for the elimination of the temperature peaks in the roll jackets are limited.\nAnother measure that has been tried is the use of a roll jacket made of a material having a higher thermal conductivity than the cellulose material of the roll filling. Such a roll jacket is described, for example, in DE-OS No. 2,939,965. The use of such a roll jacket, however, is not without problems. The roll jacket must be able to withstand the high temperatures that occur in", "input_text": " recoating elastic calender rolls.\nCellulose fibers, particularly cotton linters, utilized for the filling of elastic calender rolls offer improved technical properties for calendering the papers to be processed which accounts for their widespread employment. However, they cause a number of potential and generally cost-increasing difficulties for operating the calenders. Considerably high temperatures are produced in performing the fulling function at the circumferential region of the rolls, with the considerable line pressures of up to 300 daN/cm which are frequently employed. Considering the relatively poor thermal conductivity of the cellulose material of the cotton fibers, a heat build-up due to non-dissipated thermal energy arises in the roll jackets, the build-up leading to the highest temperatures in a region at about 10 mm below the roll surface. In particular, temperature peaks occur in the area causing superficial damage to the rolls, such damage easily giving rise to tearing of the glazed paper web or permitting the passage of foreign bodies through the roll gaps. 
The elevated temperatures occur particularly at such locations that the fibrous material of the roll filling actually burns below the surface. As a result, the roll filling loses its specific properties in these regions and generally becomes unusable for further employment. When this occurs, considerable costs are incurred for recoating.\nA number of structural measures have been tried in calenders in order to prevent temperature peaks that lead to roll scorching from occurring. One such measure is the use of internal roll cooling. Considering the poor thermal conductivity of the cellulose material, however, such measures have only a limited effect. The difficulties involved as well as measures that have been tried to eliminate these difficulties are described, for example, by E. Munch and W. Schmitz in the \"Wochenblatt fur Papierfabrikation\" 1980, Number 11/12. In this publication, the expert authors confirm that the technological possibilities"} +{"output_text": ") is controlled to maintain a constant temperature in the furnace. The temperature of the furnace is maintained at a temperature sufficient to react the organoboron precursor with the nitriding agent. The temperature of the furnace is maintained at a temperature sufficient to react the organoboron precursor with the nitriding agent. The temperature of the furnace is maintained at a temperature sufficient to react the organoboron precursor with the nitriding agent. The temperature of the furnace is maintained at a temperature", "input_text": "-carbon-hydrogen composition to react with the nitriding agent in the second heating step to form a boron-nitrogen powder; and collecting the powder.\nFurther steps in this method may be taken as follows:\nThe nitriding agent may comprise NH3, N2/H2, N2, alkylamines, hydrazine, cyanamide, dicyanamide, hydroxylamines, or mixtures thereof. 
The nitriding agent may comprise a liquid, which is aerosolized and is swept into the furnace by a carrier gas.\nThe organoboron precursor agent may comprise an alkylborate. The alkylborate may comprise a trialkylborate. Further, the trialkylborate may comprise (MeO)3B, (EtO)3B, (PrO)3B, or (BuO)3B. However, the precursor agent may comprise a polyborate. The polyborate may comprise a boroxine. Further, the organoboron precursor may comprise an azeotropic mixture. The azeotropic mixture may comprise an alkylborate and an alcohol. The alkylborate may be trimethylborate and the alcohol may be methanol. The organoboron precursor may be dissolved in simple alcohols, alkanes, or arenes prior to aerosolization, thereby increasing the percentage of carbon in the resulting BNxOyCz powder. When the organoboron precursor is dissolved in alcohols, alkanes, or arenes, the resultant BN compound is microporous or nanoporous. Further, the organoboron precursor may be dissolved in liquid ammonia prior to aerosolization.\nThe aerosolized organoboron precursor and carrier gas, and the nitriding agent are simultaneously swept or injected into the furnace. The flow of the combined gas stream (organoboron precursor and carrier gas"} +{"output_text": " N outputs requires N2/2 edge nodes and N/2 core nodes.\nThe Clos network architecture is shown in FIG. 4. The Clos network architecture is a non-blocking network architecture. The Clos network architecture is a non-blocking network architecture because the Clos network architecture is a non-blocking network architecture because the Clos network architecture is a non-blocking network architecture because the Clos network architecture is a non-blocking network architecture because the Clos network architecture is", "input_text": " having gigabit Ethernet (1 GE), or similar, high speed interfaces 320. The 2\u00d72 router modules 310 are interconnected in a manner that forms a non-blocking 4\u00d74 routing architecture. 
Different sizes and arrangements of router modules 310 are possible to form different sized router clusters 300. Furthermore, a hierarchy of cluster-based routers 300 can be used to form even larger cluster-based routers. For example, a 16\u00d716 CbR could be created from four of the 4\u00d74 cluster-based routers 300 shown in FIG. 3. General details of this prior art proposal used to be found on the Internet at http://www.stanford.edu/class/ee384y/, but the details are no longer published.\nThe CbR router 300 lacks flexibility in configuring thereof to address specific routing issues, and changes in routing functionality require new hardware or new code development. Moreover, it is apparent that a scalability issue exists as the number of 2\u00d72 router modules 310 increases as O(N2) for an O(N) growth in ports.\nAnother prior art investigation into the feasibility of using a Clos network to implement distributed routing is entitled \u201cCan Google Route?\u201d and was presented by Guido Appenzeller and Mathew Holliman. The Clos network architecture is proposed because such a design is non-blocking.\nAppenzeller and Holliman show a dramatic increase in cost-per-gigabit with total throughput for single unit dedicated routers. Appenzeller and Holliman show that using Clos-network-type router clusters is only more economical than single unit dedicated hardware routers for implementations involving very large numbers of ports. In general Clos networks employ a hierarchy of nodes: edge and core. Edge nodes exchange packets with external communications networks while core nodes do not, which is why, in general, switching N inputs to"} +{"output_text": " can also be used in this context. Generally, the pattern will correspond to a particular functional layer in a device being created in the target portion, such as an integrated circuit or other device (see below). Examples of such a patterning device include: A mask. 
The concept of a mask is well known in lithography, and it includes mask types such as binary, alternating phase-shift, and attenuated phase-shift, as well as various hybrid mask types. Placement of such a mask in", "input_text": " can be employed or \"speckling\" will occur on the new parts to be coated wherein the old and new colors are intermixed. Although some production coating applications require the use of a single color for long periods of time, many smaller volume operations and customers producing customized articles have a need for frequent color changes on a daily basis. Moreover, the economics of such smaller volume operations require that the different colored coating materials be collected for reuse rather than being thrown away as scrap.\nThis problem has been addressed to some extent in the powder spray booth systems disclosed in U.S. Pat. Nos. 4,498,913 and 4,723,505 mentioned above. In systems of this type, self-contained, cartridge filtering units are moved into position with respect to the interior of a spray booth in preparation for a coating operation involving one color of coating material, and then the filter units are subsequently removed from the booth when the spraying operation for that color is terminated. In order to resume spraying with a new color, the interior of the spray booth is cleaned and a new filter module is moved into position at the spray booth and connected with a fan module permanently associated with the booth.\nWhile systems of this type provide for increased efficiency in effecting a change from one color of coating material to another, they nevertheless require the customer to inventory a separate filter module for each color to be utilized. If a particular business routinely uses many different colors for a particular type or group of articles, the expense of purchasing a large number of separate filter modules, and providing the space to store them, can be prohibitive. 
The term \u201cpatterning device\u201d as here employed should be broadly interpreted as referring to means that can be used to endow an incoming radiation beam with a patterned cross-section, corresponding to a pattern that is to be created in a target portion of the substrate; the term \u201clight valve\u201d"} +{"output_text": " may be allowed, but an attempt to write or format the active general partition may cause a system reset.\nAccording to a fifth aspect of the present invention there is provided a method of controlling access to and modification of information stored on a storage medium of a computer system, the storage medium being divided into a plurality of non-overlapping partitions including a boot partition and at least one general partition, characterised in that\nat least one of said partitions comprises a Write Many Recoverable (WMR", "input_text": ".\nAlternatively, the information originally held in said cluster may be copied to an inactive partition.\nAccording to a fourth aspect of the present invention there is provided an apparatus for controlling access to and modification of information stored on a storage medium of a computer system, the storage medium being divided into a plurality of non-overlapping partitions including a boot partition and at least one general partition, characterised in that\nat least one of said partitions comprises a Write Many Recoverable (WMR) partition wherein, in use, if a write command is issued to overwrite any information stored in a/the WMR partition prior to undertaking said write command said information is copied and stored elsewhere on the storage medium to be copied back to said WMR partition when required, for example upon a system reset.\nPreferably the apparatus further comprises a supervising means (a Supervisor) separate of a central processing unit (CPU) of the computer system for controlling the performance of read, write or format operations stored on the storage medium so as to allow, 
restrict or prevent such operations depending upon the type of information stored within a sector and the type and status of the partition within which the sector is located wherein, in use, the supervising means causes a reset to be required of the computer system should an attempt be made to perform a prohibited read, write or format operation.\nAccording to any of the foregoing method aspects of the present invention read operations may be allowed on any information in the boot partition, but an attempt to write or format the boot partition may cause a system reset.\nFurther, boot sectors of the storage medium may be considered to be part of the boot partition, irrespective of the position of the starting sector of the boot partition as may be defined by the storage medium operating system.\nAlso, reading of any operating system information sectors or user-generated information sectors in an active general partition"} +{"output_text": ", and the napkin and bib are used for the purpose of preventing dirt on the table surface and the dress during the meal. However, the napkin and the bib are not used for the purpose of preventing dirt on the table surface and the dress during the meal, but are used for the purpose of preventing dirt on the table surface and the dress during the meal, and the napkin and the bib are used for the purpose of preventing dirt on the table surface and the dress during the meal,", "input_text": " Model Registration Nos. 3019803 and 3067791 teach, respectively, a table cover which prevents dirt on the dress even if the food is spilled during the meal.\nUtility Model Registration No. 3019803 teaches a table cover providing four pieces of cloth attached on its four sides by tape fasteners to form four bend portions so that crumbs or spilled liquids, during the meal, can be received in those portions and maintained cleanly on the table surface.\nUtility Model Registration No. 
3067791 discloses a table cover that can be used both as a bib and a table cloth by attaching a string on one side of a rectangular vinyl cloth the same size as the table surface. In use, the table cover is spread on the table to put on the tableware, after the string is tied about a neck of the infant or the old man, and thereafter they can take the meal, without any dirt on their dresses or on the carpet. The table surface can be also maintained cleanly to use again, without any sweeping, by merely washing the removed table cover. The prior art clearly teach respectively table covers directed to an infant or to an old deformed person, which can prevent dirt on the table surface and the dress, during the meal.\nAs mentioned above, various products and arts, such as the convenient napkin and bib which can receive spilled food during the meal as well as the table covers, have been provided for the dining market for the production of a pleasant meal, prevention of dirt and reduction of a charge for an attendant of the infant or the old deformed person, and those products to be indispensable fixtures for the dining table, in the future. Particularly a household that uses napkins as a table manner, is recently increasing even if there is no infant or deformed old person.\nMeanwhile, all persons including an adult, the infant and the deformed old person have to take respectively a meal"} +{"output_text": "OLS) No. 
2,939,034 are known as the so-called bubble jet recording method.\nThe bubble jet recording method is a recording method in which liquid is jetted from a fine orifice by the action of thermal energy to form a bubble in the liquid, and the bubble is expanded to apply pressure to the liquid to cause the liquid to be discharged from the fine orifice.\nThe bubble jet recording method has the following advantages:\n(1) Since the liquid", "input_text": " present invention may be a control apparatus for controlling an intake air quantity to an engine by varying an intake valve closing timing of the engine which comprises a controller that is configured or programmed:\nto calculate a target air quantity in accordance with an engine operating state,\nto calculate an estimated internal EGR quantity of the engine in accordance with the engine operating state,\nto calculate a target intake valve closing timing in accordance with the target air quantity and the estimated internal EGR quantity, and\nto control an actual intake air quantity to the engine by controlling an actual intake valve closing timing of the engine to achieve the target intake valve closing timing.\nA control apparatus according to one aspect of the invention comprises: means for determining the estimated internal EGR quantity; means for determining the target intake valve closing timing in accordance with at least the estimated internal EGR quantity; and means for controlling the intake air quantity to the engine by controlling an actual intake valve closing timing of the engine to the target intake valve closing timing.\nAccording to the present invention, a control process for varying valve timings of intake and exhaust valves of an engine, comprises: estimating an internal EGR quantity in accordance with an engine operating state; and controlling an intake valve closing timing in accordance with a required intake air quantity and the internal EGR quantity. 1. 
Field of the Invention\nThis invention relates to a liquid jet recording head which jets liquid and forms flying liquid droplets for recording.\n2. Description of the Prior Art\nInk jet recording methods such as jet recording, have recently drawn attention in that noise occurring during recording is negligible, high-speed recording is possible, and recording can be accomplished without requiring a special process to fix images on so-called plain paper.\nAmong such liquid jet recording methods, those disclosed, for example, in Japanese Laid-open Patent Application No. 51837/1979 and German Laid-open Patent Application (D"} +{"output_text": " wires are often soldered to the power source connector part of the CCFL. However, the soldering process is often time consuming and labor intensive. Moreover, the soldering process may cause the wires to be damaged, thereby reducing the reliability of the LCD device.\nIn addition, the wires are often soldered to the power source connector part of the CCFL after the LCD device is assembled. Accordingly, the wires are often exposed to the outside of the LCD device, thereby increasing the possibility of", "input_text": " 12 and an LCD panel 11 arranged between a main support 13, formed of a plastic material, and a top case 20, formed of a metal material. Generally, a guide panel 14, components of the backlight unit 12 (e.g., a reflecting plate 12a, a light-guiding plate 12b, a first diffusing or protecting sheet 12c, a first prism sheet 12d, a second prism sheet 12e, and a second diffusing or protecting sheet 12f), a lower polarizing plate 11b, the LCD panel 11, and an upper polarizing plate 11a, are sequentially stacked on the main support 13.\nBacklights may be classified as direct-type or edge-type depending on their location relative to the LCD panel and the manner in which the light they supply is directed to the LCD panel. For example, direct-type backlights irradiate light directly to a lower side of the LCD panel. 
Edge-type backlights, however, are arranged in side portions of the main support 13 and irradiate light to the light-guiding plate 12b, wherein light incident the light-guiding plate 12b subsequently becomes uniformly distributed and transmitted to the lower side of the LCD panel 11.\nAs mentioned above, Cold Cathode Fluorescent Lamps (CCFLs) are commonly used as backlights within LCM. Accordingly, a CCFL used within LCD devices is usually coupled to an inverter mounted on a rear of the LCM 10 by wires extending from a side or rear portion of the LCM 10. Typically, the wires are soldered to a power source connector part of the CCFL. The inverter converts externally provided Direct Current (DC) electricity into Alternating Current (AC) electricity, wherein the AC electricity is used by the CCFL to emit light.\nWhile assembling the LCM 10 and the LCD device, the"} +{"output_text": " separation and delamination.\nIn the manufacture of bias and radial tires, the belt is generally constructed by winding a strip of fabric, usually of rubberized fabric, around a steel wire core. The ends of the belt are then cut to length and the ends are then coated with a rubber adhesive to enhance adhesion to the rubberized fabric. The ends of the belt are then cut to length and the ends are then coated with a rubber adhesive to enhance adhesion to the rubberized fabric. The ends of", "input_text": " Pat. No. 3,803,965 and U.S. Ser. No. 676,903, owned by our common assignee, The Steelastic Company.\nIrrespective of the apparatus and method employed, i.e., calendering or assembly of strips of reinforced ribbon, when the reinforcing filaments are metallic, it has been recognized that the exposed ends of reinforcement along the cut edges of the reinforced elastomeric fabric cause an adverse effect upon the products, most particularly radial or bias tires, within which they are incorporated. 
Metallic reinforcing generally employed is steel wire either monofilament or cabled and tire manufacturers have long striven to obtain good adhesion between the elastomer and embedded metallic reinforcement by incorporating certain rubber soluble cobalt containing salts within the elastomer. Although the wire is plated or coated with brass to resist rusting or an adhesive for enhancement of adhesion, where the wire has been severed, an exposed surface is presented which immediately begins to oxidize upon contact with the atmosphere. While adhesion between chemically clean steel and some elastomers may be acceptable, such oxidation is to be avoided inasmuch as the elastomeric material in which the wire is embedded does not adhere to the wire as it becomes oxidized.\nIn the manufacture of steel belted bias and radial tires, one or more circumferentially oriented belts are located beneath the tread stock to maintain the integrity and shape of the tire during inflation and subsequent load. The steel reinforcement in these belts is commonly disposed at an angle from the length of the belt, and subsequently with respect to a plane perpendicular to the rotational axis of the tire, and thus, when the belt is constructed, all of the severed ends of steel reinforcement are exposed along both sides of the belt. In addition to making such belts difficult to handle by the worker, oxidation of these exposed ends before the belt can be incorporated in a tire, gives rise to subsequent belt edge"} +{"output_text": " the channel frequencies. The etalon is mounted in a manner that causes the etalon to be tuned to the channel frequencies.\nIn another example, the system includes a laser controlled to transmit at a frequency selected from a group of channel frequencies having fixed wavelength spacing. An etalon is mounted between the laser and an output optic fiber. 
The etalon has a free spectral range (FSR) equal to the fixed wavelength spacing and is tuned so that fringes produced by the et", "input_text": " beam from the laser and hence the amplitude of the beam as a function of wavelength is unaffected, as might otherwise occur with the detector positioned off-axis and a beam splitter employed to reflect a portion of the laser beam to the detector.\nAnother concern lies in the possibility of optical frequency chirp affecting signals transmitted along the optic fiber at one of the ITU gridline wavelengths. Briefly, optical chirp occurs when a current source used to modulate the transmission laser causes dynamic changes in the index of refraction of a laser junction or other optical transmitter. A dynamic change in the index of refraction in turn causes a dynamic change in the actual transmission frequency of an optical pulse transmitted into the optic fiber. As a result, the time average of the optical transmission may indicate that a leading edge of an optical pulse has a slightly different frequency than the trailing edge of the pulse. Although the initial frequency differential between the leading and trailing edges of the pulse may be slight, chromatic dispersion inherent in optic fibers causes the leading and trailing edges to propagate at different speeds resulting in potentially significant distortion of the optical pulse, particularly over long haul optic fiber transmission systems. The distortion limits either the maximum frequency of signal transmission modulation or the maximum distance at which signals can be reliably transmitted.\nAccordingly, it was desirable to provide a system for limiting optical frequency chirp, particularly for use within system transmitting on ITU channel wavelengths and other aspects of the invention described in the parent application filed Sep. 28, 2001 were directed to providing just such a system. 
In one example, the system includes a laser controlled to transmit at a frequency selected from a group of channel frequencies having fixed wavelength spacing. An etalon is mounted between the laser and an output optic fiber. The etalon has a free spectral range (FSR) equal to the fixed wavelength spacing and is tuned so that fringes produced by the etalon are aligned with"} +{"output_text": " for comparing the address field of the frame with a list of address fields corresponding to the current flows of data each time a data frame is to be transmitted by the ingress node over the network to an egress node, and for selecting all the candidate headers that are associated with the flows of data having the same address field, and for determining a compressed header of the frame header for each of the candidate headers based upon the position and the number of bytes that differ between the frame header and the candidate header", "input_text": " being the header of a frame pertaining to a flow of data irrespective of protocol.\nAnother object of the invention is to provide a new system enabling an ingress node to transmit compressed frame headers at high speeds over a data transmission network irrespective of the protocol being used.\nTherefore, the invention relates to a method for transmitting data frames with compressed headers in a multiprotocol data transmission network comprising at least one ingress node and a plurality of switching nodes, wherein the ingress node transmits flows of data to at least one of the switching nodes used as egress nodes, each flow of data being frames that include data bytes and a header that includes information which defines protocols associated with the various layers involved in the flow of data.\nThis method comprises the steps of:\ncomparing the address field of the frame with a list of address fields corresponding to the current flows of data each time a data frame is to be transmitted by the ingress node over the network to an 
egress node,\nselecting all the candidate headers that are associated with the flows of data having the same address field,\ndetermining a compressed header of the frame header for each of the candidate headers based upon the position and the number of bytes that differ between the frame header and the candidate header,\nselecting as reference header the one amongst the candidate headers for which the compressed header has the best compression ratio, and\ntransmitting a compressed data frame wherein the data bytes of the data frame received by the ingress node are preceded by a reference label associated with the reference header and a compressed header including a field defining the position and the number of consecutive bytes in the portion being compressed which are different from the corresponding bytes of the reference header, a field including the different bytes, and a field including the portion of header not being compressed.\nAccording to another aspect, the invention relates to a system for implementing the above method comprising address lookup means"} +{"output_text": " a portable computer system having a keyboard that is adapted to be coupled to a portable computer system. The portable computer system includes a housing having a keyboard opening. The keyboard opening is adapted to receive the keyboard. The keyboard includes a plurality of keys. The keys are disposed on a surface of the keyboard. The surface of the keyboard is disposed in a plane. The plane is substantially parallel to a plane of the keyboard opening. The keyboard includes a plurality of keys. The keys are disposed on a surface of", "input_text": " would perform the same process for each letter.\nFurther, when that same user needed to input numerical data, the user would switch from the alphabetic keyboard to the numerical keyboard to enter the required numbers, and then switch back to the alphabetic keyboard to continue with the Inputting of alphabetic data. 
When punctuation such as, e.g., an exclamation point is needed, the user was required to switch to the keyboard that contained the punctuation. While this attempt also provides an effective way for many users to input alphanumeric characters, it can be a somewhat slow process for others. As such, this method is not particularly well suited for a user who desires to enter substantial amounts of data.\nIn another attempt, a miniature keyboard was developed which was adapted into the form factor of the portable computer system. Unfortunately, because this miniature keyboard was so small, as were the keys, it was difficult to input data. Additionally, using any more than one finger at a time was difficult, given the tiny surface on which the keys were disposed. Further, because of the miniature size of the keyboard, a user would be required to contort their wrists and arms into uncomfortable positions to input the data.\nThus, a need exists for an apparatus that is coupled to and provides a convenient and simple way for a user to input data into a portable computer system. An additional need exists for an apparatus that meets the above listed objective and which provides full sized keyboard functionality to a portable computer system. An additional need exists for an apparatus which meets the above listed objectives and which provides for proper ergonomic positioning of the keyboard relative to the user.\nThese and other objects and advantages of the present invention will no doubt become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiments which are illustrated in the various drawing figures.\nThe present invention provides"} +{"output_text": ",582,865 (Garwood et al.), U.S. Pat. No. 4,786,369 (Garwood et al.), U.S. Pat. No. 4,865,845 (Garwood et al.), U.S. Pat. No. 4,935,004 (Garwood et al.), U.S. Pat. No. 5,037,432 (Garwood et al.), U.S. Pat. 
No.", "input_text": " function and then uses the static type information to identify additional virtual call sites.\nHowever, since CHA relies on knowing the set of classes derived from a particular static class type, it requires knowledge of the complete class hierarchy. Therefore, like the Resolution by Unique Name Method, CHA can either be performed at link-time or at an earlier phase, provided that a program database is available which specifies the complete class hierarchy.\nStill another method known in the art for removing dead code is referred to as \"alias analysis.\" A description of alias analysis can be found in several publications, including K. Cooper et al., \"Fast Interprocedural Alias Analysis\", Conference Record of the Sixteenth ACM Symposium on Principles of Programing, ACMPRESS, January 1989, pp. 49-59. Basically, alias analysis processes a program in a manner that keeps track of every variable identified during compilation, and keeps track of what each variable could possibly point to, and iterates repeatedly over each function call to determine, rigorously, if the call will occur during running of the program. Depending on which particular alias analysis algorithm is used, and which language it is applied to, and other program-specific parameters, this method often requires, in typical cases approximately fifty to one thousand iterative inspections of the entire program and of each function call. Accordingly, alias analysis can be impracticable for many applications. 1. Field of the Invention\nThe present invention relates to colored orthopedic resins and colored orthopedic casting materials which are storage stable.\n2. The Prior Art\nRecently, water curable, isocyanate functional, polyurethane prepolymers have been found to be extremely useful in formulating a resin for orthopedic casting materials, as disclosed, for example, in U.S. Pat. No. 4,502,479 (Garwood et al.), U.S. Pat. No. 
4"} +{"output_text": " plasma duct, and the number of the baffles is increased. Therefore, the plasma duct is enlarged, and the plasma duct is increased in weight.\nSecondly, the plasma duct is bent or curved, and the baffles are provided on the inside wall of the plasma duct. Therefore, the plasma duct is increased in length, and the plasma duct is increased in weight.\nThirdly, the plasma duct is bent or curved, and the baffles are provided on the inside wall of the", "input_text": " containing the target material generated by the target 33, a charged particle (ion) advances following the line of magnetic force (deflection magnetic field) 40, and is sent from an exit of the plasma duct 39 into a film forming chamber (not shown), and evaporated to a substrate (not shown) located in the chamber. Non-ionized neutral particles and coarse particles do not reach the substrate but adhere to the plasma duct 39 and the baffles 52 disposed on the inside wall of the plasma duct 39.\nTherefore, major parts of coarse particles are, as advancing within the plasma duct 39, removed by adhering to the inside wall of the plasma duct 39 and the baffles 52.\nU.S. Pat. No. 5,279,723 and Japanese Patent Unexamined Publication No. Hei. 10-280135 disclose similar devices, each of which is provided, for catching coarse particles, with the baffles, which is composed of a metal ring like member or a vane like member extending toward the inside from the inside wall of the plasma duct, on the inside wall of the bent or curved plasma duct.\nHowever, problems as mentioned under are involved with the structure of the baffles located on the inside wall of the conventional plasma duct.\nFirstly, as mentioned above, each of the baffles is composed of the metal ring like member or the vane like member extending toward the inside from the inside wall of the plasma duct. 
Therefore, when coarse particles come flying to parts facing vertically to the plasma duct, they adheres thereto or are reflected in the cathode direction, and when those come flying to a parallel face to the plasma duct, they are reflected with high possibility in the direction of the substrate.\nIf many ring or vane like members are provided for repulsing much coarse particles in the target direction, there are inevitably many faces parallel to the"} +{"output_text": " than optimum surge protection, since the surge arrestor is not in parallel with the fuse-transformer combination. The surge arrestor is not in parallel with the fuse-transformer combination because the surge arrestor is not designed to carry the full surge current created during a lightning strike. The surge arrestor is designed to carry the surge current to ground, and the fuse-transformer combination is designed to carry the surge current to ground. The surge arrestor is not designed to carry the surge current", "input_text": " by the arrestor, the resistance of the arrester drops precipitously to a level substantially lower than that of the transformer, and the surge arrester diverts the surge to ground. Once the surge has passed or dissipated, the arrester once again returns to the high resistance state to re-energize the transformer.\nFuses are used to further protect the transformers located on poles or at remote locations in the distribution network. Power lines coming into the transformer are typically rated at between one half and 100 amps, supporting voltages of between 2400 and 38000 volts thereon. The fuses are commonly placed in a parallel electrical relation to the surge arrestor. The fuse, and transformer, are mounted in series and these two components in series are then mounted in an electrically parallel configuration to the surge arrestor. 
The arrestor provides protection to the transformer by diverting surges, such as those caused by lightning, to ground, rather than through the fuse and transformer combination and into the rest of the distribution network. The fuse provides long-term overload protection to the circuit, such as what occurs when a short appears as a result of a failure in the transformer or a long-term over load situation is present in the secondary circuit. However, the fuse is not intended to carry the lightning surge to ground. The prior art fuses cannot withstand the full surge current created during a lightning strike, and thus the surge arrestor must be placed in parallel with the series combination of the transformer and fuse to protect both the transformer and the fuse.\nTo physically locate the surge arrestor and fuse-transformer combination in a parallel electrical configuration, the surge arrestor and fuse are both placed upon the pole, pad, or other mounting location, or otherwise remotely located from the transformer tank, and the ground lead is run from the transformer tank to the surge arrestor. This arrangement leads to less"} +{"output_text": " fan is not engaged, the fan will not be turning and the engine will not be running. The fan is not turning because the clutch is not disengaged. The clutch is not disengaged because the fan is constantly engaged.\nThe present invention is directed to a magnetic clutch for transferring power from a drive shaft to a driven shaft. The clutch includes a housing having a first end and a second end. A first magnet is mounted on the first end of the housing. A second magnet is mounted on", "input_text": " engines hotter using the same cooling equipment. 
The magnetic fan disclosed in the aforesaid patent application can be used with these hotter engines because it transfers 100% of the power using a magnetic field and does not have the 7-10% inefficiencies due to the viscosity of the viscous fluid.\nIn all of these fans and fan clutch systems, removal of heat is a significant problem for the bearings and for the fan belts. The failure of bearings or the breaking of fan belts are a considerable cause of maintenance or down time. By keeping the bearings and fan belt temperature lower, the life of the belts and bearings can be improved considerably. These are important considerations with respect to the maintenance and the overall down time with respect thereto caused by bearing or belt failures. The replacement of a bearing on a fan clutch is a significant maintenance problem for a truck or a piece of equipment. Thus, there is a need for a new and improved magnetic clutch that overcomes the deficiencies of existing ON/OFF frictional face clutches and existing viscous fluid clutches.\nSome vehicles, principally in Europe, have a fan clutch which is mounted on the crankshaft and which is always partially engaged to transfer a certain amount of power to the fan. For example, at least 40% of the power to turn the fan to full speed needed and up to 90% of the power to turn the fan at full speed. This clutch never transfers 100% of the power, and this clutch is never totally disengaged such that the fan can be free-wheeling. This particular type of clutch also has a vibration isolator which to some extent serves to dampen or isolate vibrations at the fan clutch from the crankshaft vibrations. If the vibration isolator wears out allowing vibration to be transmitted, the clutch will wear out in only a few days.\nSuch a constantly engaged fan wastes fuel. If the"} +{"output_text": " light-source 5 should aim at a point on the imaging plate 3 that is 30 units north and 30 units east. 
To ensure that the imaging plate 3 is at the appropriate location, the display driver 6 also has to specify the values of the times t1, t2 and t3 based on how fast the imaging plate 3 is spinning.\nThe display driver 6 could also draw the line 8 in FIG. 2 in a polar coordinate system by specifying, for example, that at time t1", "input_text": " axis 4. In FIG. 2, the imaging plate 3 is shown in six successive instants as it rotates around the axis 4, now perpendicular to the page. At each of the six instants, the light source 5, under the control of the display driver 6, illuminates a spot 7 on the imaging plate 3. As shown in FIG. 2, by aiming the light source 5 at the correct spot and firing the light source 5 at the right time, it is possible to trace out the line 8. It is the function of the display driver 6 to correctly aim and fire the light source 5 so as to trace out the line 8.\nTo aim the light source 5, the display driver 6 needs a way to identify points in space. In other words, the display driver 6 needs a coordinate system. One possible coordinate system is a Cartesian coordinate system.\nUsing a Cartesian coordinate system, the display driver 6 would draw the line 8 in FIG. 2 by specifying, for example, that at time t1 the light-source 5 should aim 30 units north, at time t2, the light-source 5 should aim 29 units north and 1 unit east, at time t3, the light-source 5 should aim 28 units north and 2 units east, and so on. To ensure that the imaging plate 3 is at the appropriate location, the display driver 6 also has to specify the values of the times t1, t2 and t3 based on how fast the imaging plate 3 is spinning. Although the spinning of the imaging plate 3 can be resolved into a north-south component and an east-west component, this is a computationally taxing exercise that can easily be avoided by using a polar coordinate system.\nAs an alternative, the display driver 6 could draw the line in FIG. 
2 in polar coordinates by specifying, for example, that at time t1, the"} +{"output_text": " header, and reference numeral 3b means an outlet header. Reference numeral 4a means a heat transfer tube, and reference numeral 4b means a branch pipe. Reference numeral 5a means a heat transfer tube, and reference numeral 5b means a branch pipe. Reference numeral 6a means a heat transfer tube, and reference numeral 6b means a branch pipe. Reference numeral 7a means a heat transfer tube, and reference numeral 7b means a branch pipe", "input_text": " conductivity.\nFIG. 66 shows another conventional embodiment, i.e., a plate fin-type heat exchanger used for a room air conditioner and so forth. For assembly of the heat exchanger, instead of the small-gage wires 2 serving as a fin, plate-type fins are mounted at interval of about 1 to 5 mm. Further, a heat transfer tube 1 is inserted into a hole provided in the fin, and after the insertion, fluid is introduced into the heat transfer tube 1 with pressure. Thereby, a diameter of the heat transfer tube 1 is expanded to bring the heat transfer tube 1 into tight contact with the plate fin 102.\nIn the plate fin-type heat exchanger, the out-tube operating fluid A can flow along the plate fin without large turbulence, resulting in reduced thermal conductivity.\nIn recent years, a diameter of the heat transfer tube has been decreased in order to provide a more compact and higher-performance heat exchanger. However, when the narrow heat transfer tube is applied to the heat exchanger (in particular, an evaporator), a higher pressure loss is caused in a coolant flowing in the tube, resulting in a reduced performance of an air conditioner. Hence, in a typical method, the number of path of the heat exchanger is increased to decrease an amount of circulating coolant per path, thereby avoiding the reduction of performance.\nTypically, a branch pipe may be used for several paths. 
For several tens to several hundreds paths, in many cases, an inlet header and an outlet header are mounted, and a plurality of heat transfer tubes are disposed between the headers so as to provide a multi-path heat exchanger (evaporator).\nFIG. 67 is a sectional view of a conventional multi-path evaporator disclosed in Japanese Patent Publication (Kokai) No. 6-26737. In the drawing, reference numeral 3a means an inlet"} +{"output_text": " of the GPS receiver can be affected by the position of the GPS satellites, the position of the GPS receiver, the position of the GPS antenna, the position of the GPS receiver antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna, the position of the GPS antenna,", "input_text": " regulations in areas such as grade crossings must be complied with.\nTrains or a maintenance crews must be coordinated by a dispatcher to occupy a portion of main line track between named locations (e.g., mile markers, switches, stations, or other points). In addition to specifying certain track sections, dispatchers must be able to coordinate trains and crews with respect to specifying speed limits, direction, time limits, and whether to clear the main line (e.g., by entering a secondary track such as a siding) and/or any other section of track (sidings, yards secondary track, etc.). Any errors in this process can lead to disastrous consequences.\nAttempts to automate the above-described track coordination system include Centralized Traffic Control (CTC) systems which allow a dispatcher to control movement of trains by controlling track switches and wayside signals from a central dispatch office. 
More advanced systems include Automatic Train Control (ATC) systems where train location, speed and train control information are continually exchanged between a train cab and computerized wayside controllers in real time (in some systems, often referred to as cab signal systems, track rails are used to carry this information). The more advanced versions of CTC and ATC systems often employ GPS technology for accurate positioning information for speed, reliability and safety reasons.\nGiven the foregoing, one can conclude that the accuracy of any particular standalone GPS receiver (e.g., located on a train car), or collection of GPS receivers (e.g., several receivers working as part of a CTC or ATC system) is of concern. Any given GPS receiver can have an accuracy (i.e., can have errors in their positioning determination) ranging from 10 to 100 meters. The accuracy of a GPS receiver is affected by several different factors that can be categorized as either \u201cnatural\u201d or \u201cmilitary.\u201d\nAs for the natural category of errors, the position"} +{"output_text": " characterized by a limited bandwidth, and the data rate of the data services is limited by the bandwidth of the radio channel. The data rate of the data services is also limited by the data rate of the radio channel. The data rate of the data services is also limited by the data rate of the radio channel. The data rate of the data services is also limited by the data rate of the radio channel. The data rate of the data services is also limited by the data rate of the radio channel.", "input_text": " other instances, the buildup is washed out of the tube by action of the pouring stream and into the mold where it can rupture the solidifying shell causing molten steel to \"break-out\" and necessitating casting of that strand to be terminated. 
Experimental data shows that the larger diameter tubes require the seal between the shroud tube and the tundish to be as tight as possible to prevent air leakage into the shroud tube from the surrounding atmosphere.\nJapanese researchers have determined that the oxygen content of shrouding gas must be maintained at less than 0.8% to prevent the continuous formation of oxide inclusions from reoxidation of the steel stream. In our experimental work, which involved the accurate measurement of shroud and mold environment oxygen concentrations, we determined: first, that a shroud sealed to the tundish has a significantly lower oxygen concentration in the shroud than one with a gap between the top of the shroud and the bottom of the tundish; and second, that as the gap between the bottom of the shroud and the top of the mold is decreased, the oxygen concentration in both the shroud and in the mold decreases significantly.\nTherefore, the shroud tube should extend as far as possible downwardly toward the mold, yet allow space between the shroud and mold for viewing the liquid level in the mold. Heretofore, there has been no convenient mechanism for placing a shroud against a tundish and for removing it when necessary in order to divert the pouring stream from the mold. The invented shroud apparatus is readily positionable tightly against the pouring nozzle of a molten metal pouring stream from a bottom-pour vessel, yet is easily and quickly removed to accomodate other apparatus such as a launder beneath the stream. There has been a growth in demand for packet switched wireless data services due to the growth in internet applications. A typical channel over which these data services are delivered is a radio channel. Radio channels are"} +{"output_text": " substrates is not well controlled. For example, the growth of nanotubes on a silicon substrate is not well controlled. The growth of nanotubes on a silicon substrate is usually carried out by CVD using a metal catalyst. 
The main roles of the catalyst are to break bonds in the carbon carrying species and to absorb carbon at its surface and to reform graphitic planes through diffusion of carbon through or around an interface. However, the growth of nanotubes is usually carried out on silicon or other semiconducting substrates.", "input_text": " on n-channel semiconductor (NMOS) circuitry.)\nThere are specific problems related to control of the properties of grown materials. Even though numerous different alternative growth methods exist for growing carbon nanostructures, controlling the interface properties between the nanostructures and the substrates, the body of the nanostructures, and the tip of the nanostructures are not yet demonstrated to be well controlled by utilizing a single method of growth.\nCVD typically employs a metal catalyst to facilitate carbon nanostructure growth. The main roles of the catalyst are to break bonds in the carbon carrying species and to absorb carbon at its surface and to reform graphitic planes through diffusion of carbon through or around an interface (see, e.g., Kim, M. S.; Rodriguez, N. M.; Baker, R. T. K., Journal of Catalysis, 131, (1), 60, (1991); and Melechko, A. V.; Merkulov, V. I.; McKnight, T. E.; Guillorn, M. A.; Klein, K. L.; Lowndes, D. H.; Simpson, M. L., J. App. Phys., 97(4), 41301, (2005), incorporated herein by reference).\nHowever, the growth of nanotubes is usually carried out on silicon or other semiconducting substrates. Growth from such metal catalysts on conducting metal substrates or metal underlayers is almost lacking. This is because it has been found that it is hard to make a good contact between a growing nanostructure and a conducting substrate with good quality grown nanostructures in terms of control over diameter, length and morphology. Nevertheless, for making CMOS-compatible structures, it is necessary to use a conducting substrate. 
In particular, this is because a metal substrate, or base layer, acts as bottom electrode for electrical connection to the nanostructures.\nNevertheless, growth of nanostructures on CMOS compatible"} +{"output_text": " stacking direction, the first surface is made more than the second surface in the thickness direction of the first semiconductor element as seen in the stacking direction, and the second surface is made more than the first surface in the thickness direction of the second semiconductor element as seen in the stacking direction, the stress on the first semiconductor element increases in accordance with an increase of the rigidity against bending of the first semiconductor element, and the stress on the second semiconductor element decreases in accordance with the decrease of the thickness thereof on the", "input_text": " the stacking direction is made more than the connecting rigidity between the second semiconductor element and the substrate through the second synthetic resin at the outside of the second semiconductor element as seen in the stacking direction without the synthetic resin whose Young\"\"s modulus is not less than Young\"\"s modulus of the first synthetic resin at the inside of the second semiconductor element as seen in the stacking direction so that the connecting rigidity between the second semiconductor element and the substrate is substantially formed by a bending low rigidity of the second synthetic resin, the stress on the semiconductor element increases in accordance with the increase of the connecting rigidity between the semiconductor element and the substrate, and the stress on the semiconductor element decreases in accordance with the decrease of the thickness thereof on the substrate, the crack or excessive stress on the first semiconductor element whose connecting rigidity is made more than the connecting rigidity between the second semiconductor element and the substrate is effectively prevented.\nIf the thickness of the 
second semiconductor element is smaller than the thickness of the first semiconductor element when a main component of the first semiconductor element is Si, a main component of the second semiconductor element is GaAs, since Young\"\"s modulus of GaAs is more than Young\"\"s modulus of Si, a rigidity against bending of the semiconductor element increases in accordance with an increase in Young\"\"s modulus of the semiconductor element, the stress on the semiconductor element increases in accordance with an increase of the rigidity against bending of the semiconductor element, and the stress on the semiconductor element decreases in accordance with the decrease in thickness thereof on the substrate, the crack or excessive stress on the second semiconductor element whose rigidity against bending is made more than the rigidity against bending of the first semiconductor element by a difference in Young\"\"s modulus between the first and second semiconductor elements is effectively prevented.\nWhen each of the first and second semiconductor elements includes a first surface facing to the substrate and a second surface as a reverse surface with respect to the first surface in the"} +{"output_text": " produce an injectable preparation.\nThe compounds of the invention can be prepared by the following reaction scheme: \nwherein R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R12, R13, R14, R15, R16, R17, R18, R19, R20, R21, R22, R23, R24, R25,", "input_text": " being preferred.\nThe antiplatelet drugs may be employed in amounts as indicated in the PDR. Ifetroban may be employed in amounts as set out in U.S. Pat. No. 5,100,889.\nAntiosteoporosis agents suitable for use herein in combination with the compounds of formula I of the invention include parathyroid hormone or bisphosphonates, such as MK-217 (alendronate) (Fosamax(copyright)). 
Dosages employed will be as set out in the PDR.\nIn carrying our the method of the invention, a pharmaceutical composition will be employed containing the compounds of structure I, with or without another therapeutic agent, in association with a pharmaceutical vehicle or diluent. The pharmaceutical composition can be formulated employing conventional solid or liquid vehicles or diluents and pharmaceutical additives of a type appropriate to the mode of desired administration. The compounds can be administered to mammalian species including humans, monkeys, dogs, etc. by an oral route, for example, in the form of tablets, capsules, granules or powders, or they can be administered by a parenteral route in the form of injectable preparations. The dose for adults is preferably between 50 and 2,000 mg per day, which can be administered in a single dose or in the form of individual doses from 1-4 times per day.\nA typical capsule for oral administration contains compounds of structure I (250 mg), lactose (75 mg) and magnesium stearate (15 mg). The mixture is passed through a 60 mesh sieve and packed into a No. 1 gelatin capsule.\nA typical injectable preparation is produced by aseptically placing 250 mg of compounds of structure I into a vial, aseptically freeze-drying and sealing. For use, the contents of the vial are mixed with 2 mL of physiological saline, to"} +{"output_text": " applied to the turnup portion is large, which are advantageous in the fatigue resistance of the cord in the carcass.\nThe inventor has made studies with respect to the fatigue resistance of the tire and found that the fatigue resistance of the tire can be improved by controlling the dynamic modulus of the polyester cord used in the carcass ply lower at a low tension and higher at a high tension as compared with that of the conventional PET cord. 
Furthermore, the inventor has made studies with respect to the tire durability", "input_text": " radial tire simultaneously improving the steering stability and the ride comfort against vibration by using polyester cords having specified properties in the carcass of the tire without sacrificing the durability.\nThe inventor has made various studies with respect to the steering stability and ride comfort against vibration of the tire when the polyester cords having specified properties are used in the carcass of the radial tire and found that the steering stability and the ride comfort against vibration can simultaneously be improved by controlling dynamic stiffness of the tire. That is, it has been confirmed that tension applied to the carcass is high in the sidewall portion and low in the tread portion owing to the presence of the belt even when the internal pressure or the structure of the tire is same. As a result, the invention has been accomplished by masking the dynamic modulus of the polyester cord used in a carcass ply lower at a low tension and higher at a high tension as compared with that of the conventional PET cord. 
Furthermore, the inventor has made studies with respect to the fatigue resistance and the tire durability and found the following facts:\n(a) In general, the organic fiber cord hardly causes fatigue at tensile strain but causes fatigue at compression strain;\n(b) The compression strain is applied to a turnup portion of the carcass ply wound around a bead core from an inside of the tire toward an outside thereof;\n(c) The compression strain in the turnup portion becomes smaller as the bending of the bead portion becomes small when a load is applied to the tire;\n(d) In case of a tire having a small aspect ratio, the stiffness is high and the bending of the bead portion is small, so that strain applied to the turnup portion is small, which are advantageous in the fatigue resistance of the cord in the carcass; and\n(e) As the stiffness of the bead portion becomes high, the deformation of the sidewall portion is gentle and strain"} +{"output_text": " mold frame 730 is open. The display unit 710 and the back light assembly 720 are fixed and supported by the mold frame 730.\nThe mold frame 730 is provided with a plurality of screw holes 732 for fixing the display unit 710 and the back light assembly 720. The display unit 710 and the back light assembly 720 are fixed and supported by the mold frame 730 by inserting the screws into the screw holes 732.\nThe mold frame 730 is provided with a plurality of", "input_text": " signals to the liquid crystal display panel 712 and a gate portion for providing the gate driving signals to the gate line of the liquid crystal display panel 712. Namely, the integrated printed circuit board 714 generates the gate driving signals for driving the liquid crystal display device, the data signals, and a plurality of timing signals for applying the signals at a proper time. 
The gate signals are applied to the gate line of the liquid crystal display panel 712 through the gate side flexible circuit board 718, and the data signals are applied to the data line of the liquid crystal display panel 712 through the data tape carrier package 716.\nA back light assembly 720 that provides a uniform light to the display unit 710 is located under the display unit 710. The back light assembly 720 has a linear lamp 722 on one side of the liquid crystal display module 700 to provide the light. A light guide plate 724 has a size corresponding to the liquid crystal display panel 712 of the display unit 710, and is located under the liquid crystal display panel 712. The lamp side of the light guide plate 724 is the thickest. The thickness gradually decreases as goes away from the lamp 722. The light guide plate 724 guides the light generated in the lamp 722 towards the display unit 710, and changes the passage of the light.\nA plurality of optical sheets 726 that spread and intensify the light and pass it towards the liquid crystal display panel 712 are provided above the light guide plate 724. A reflection plate 728 provided under the light guide plate 724 reflects the light leaking from the light guide plate 724 and promotes the efficient use of the light.\nThe display unit 710 and the back light assembly 720 is fixed and supported by a mold frame 730 which is a receiving receptacle. The mold frame 730 has a box-shape, and the upper surface of the"} +{"output_text": " described in a number of U.S. patents, including U.S. Pat. No. 4,873,664, issued Oct. 10, 1989, to Miller et al., entitled \"Ferroelectric Memory Cell and Memory Array\"; U.S. Pat. No. 4,888,789, issued Dec. 19, 1989, to Mukherjee et al., entitled \"Ferroelectric Memory Cell\"; U.S. Pat. No. 
4,912,631,", "input_text": " those memories employing an array of one-transistor, one-capacitor (\"1T/1C\") ferroelectric memory cells.\nThis application is related to the following applications assigned to the assignee of the present invention, which are all hereby specifically incorporated by this reference:\nSer. No. 08/970452, entitled \"REFERENCE CELL FOR A 1T/1C FERROELECTRIC MEMORY\";\nSer. No. 08/97020, entitled \"MEMORY CELL CONFIGURATION FOR A FERROELECTRIC MEMORY\";\nSer. No. 08/970543, entitled \"SENSING METHODOLOGY FOR A 1T/1C FERROELECTRIC MEMORY\";\nSer. No. 08/970454, entitled \"SENSE AMPLIFIER CONFIGURATION FOR A 1T/1C FERROELECTRIC MEMORY\";\nSerial No. 08/970454, entitled \"COLUMN DECODER CONFIGURATION FOR A 1T/1C FERROELECTRIC MEMORY\";\nSer. No. 08/970821, entitled \"SENSE AMPLIFIER LATCH DRIVER CIRCUIT FOR A 1T/1C FERROELECTRIC MEMORY\";\nSer. No. 08/970522, entitled \"PLATE LINE DRIVER CIRCUIT FOR A 1T/1C FERROELECTRIC MEMORY\"; and\nSer. No. 08/970448, entitled \"PLATE LINE SEGMENTATION IN A 1T/1C FERROELECTRIC MEMORY\".\n2. Description of the Prior Art\nThe first designs with ferroelectric capacitors utilized memory cells containing two transistors and two ferroelectric capacitors, (\"2T/2C\"). Ferroelectric 2T/2C memory products are"} +{"output_text": " particular, from the group consisting of F, Cl, Br, and Si.\nThe heteroatom containing malonate can be selected from the group consisting of:\n(i) esters of malonic acid with alcohols of formula (II): \nwherein R5 and R6 are independently selected from H, C1-C20 linear or branched alkyl, alkenyl, cycloalkyl, aryl, arylalkyl or alkylaryl group and said R5 and R6 can also be", "input_text": " tertiary carbons and having 3-20 carbon atoms. 
Although an improvement in terms of yields and isotactic index over the previously cited documents is obtained, the results are still not satisfactory for an economical use of the catalyst components disclosed therein.\nIt has now surprisingly been found that the polymerization yields and the isotactic index of the polymer can be improved by using catalyst components comprising heteroatom containing malonates as internal donors.\nIt is therefore an object of the present invention to provide a solid catalyst component for the polymerization of olefins CH2xe2x95x90CHR in which R is hydrogen or a hydrocarbon radical with 1-12 carbon atoms, comprising Mg, Ti, halogen and an heteroatom containing malonate.\nThe term heteroatom means any atom, different from C and H, in addition to the oxygen atoms deriving from the malonic acid.\nIn particular, the electron donor compounds can be selected from esters of malonic acids of formula (I): \nwherein R1 and R2 equal to or different from each other, are H or a C1-C20 linear or branched alkyl, alkenyl, cycloalkyl, aryl, arylalkyl or alkylaryl group and said R1 and R2 can also be joined to form a cyclic group; R3 and R4 are independently selected from C1-C20 linear or branched alkyl, alkenyl, cycloalkyl, aryl, arylalkyl or alkylaryl group and R3 and R4 can also be joined to form a cyclic group; with the proviso that at least one of the R1 to R4 groups contains at least one heteroatom selected from the group consisting of halogens, N, O, Si, Ge, P, and S.\nThe heteroatoms, are preferably selected from the group consisting of F, Cl, Br, and Si, and, in"} +{"output_text": " example, in a 512.times.512 panel, the capacitance of the address circuits must be increased by a factor of about 4.5.\nIn addition, the proposed technique requires that the address circuits be capable of driving the panel at a high speed. 
For example, in a 512.times.512 panel, the address circuits must be capable of driving the panel at a speed of about 1 MHz.\nIn addition, the proposed technique requires that the address circuits be capable of driving the panel", "input_text": " may be made for instance to the following published articles: (1) \"Discharge-Logic Drive Schemes\", by J. D. Schermerhorn and J. W. V. Miller, IEEE Transactions On Electron Devices, Vol. ED - 22, No. 9, September 1975, pages 669-673; (2) \"Coupled-Matrix, Threshold- Logic AC Plasma Display Panel\", T. N. Criscimagna, J. R. Beidl, M. Steinmetz and J. Hevesi, Proceeding of the SID, Vol. 17/4, 4th Quarter 1976, pages 176-179.\nIn such proposed gas discharge logic addressing techniques, each discharge point or display pixel on the plasma panel is provided with two row (X) electrodes and two column (Y) electrodes. A particular pixel is selected for display only if suitable signals are provided on all four input electrodes, and thus the pixel can be considered a four input AND gate. The input panel electrodes for each pixel are grouped such that for a 512.times.512 panel only 48 circuit drivers for the row electrodes and 48 circuit drivers for the column electrodes, or a total of 96 circuit drivers are required. In the addressing configuration, each row and column axis may contain groups of 32 electrodes connected in parallel to one driver circuit and groups of 16 electrodes also connected in parallel to a single driver circuit.\nSuch a proposed technique has led to significant problems. First, the electrodes are grouped together and connected to conductor buses in a way that requires electrical crossovers in the panel. The crossovers must be of low capacitance and must withstand voltage breakdown due to the address pulses, which renders manufacturing of a suitable plasma panel significantly more complex. 
Another major problem is the significant increase in capacitance which the address circuits must drive compared to a conventional plasma panel. For"} +{"output_text": " properties of the silicon carbide.\nThe present invention provides a method for growing a silicon carbide crystal. The method includes providing a silicon carbide seed crystal, and growing a silicon carbide crystal on the seed crystal. The method further includes providing a silicon carbide seed crystal having a silicon carbide seed crystal surface, and growing a silicon carbide crystal on the seed crystal surface. The method further includes providing a silicon carbide seed crystal having a silicon carbide seed crystal surface, and growing", "input_text": " produces the best initial nucleation on the seed crystal may be different from the composition that produces the best bulk growth (and vice versa). Thus, in the conventional physical vapor transport (sublimation) systems, neither nucleation nor bulk growth may be optimized. Instead, both may be compromised based upon the fixed starting material.\nStated differently, in conventional physical vapor transport growth techniques, the relevant system is loaded with a starting material and then heated to drive the sublimation growth of the resulting crystal. The application of heat, however, is typically the only step that can be manipulated during the growth process; i.e., the starting materials are locked in and cannot be modified as growth proceeds.\nOther problems exist. For example, where nitrogen is used as a dopant to create n-type material, the normal and expected process is for the nitrogen dopant atoms to replace carbon atoms in the crystal structure. Changing the ratio of silicon-to-carbon, however, causes the nitrogen to compete with a different amount of carbon for a given position in the growing crystal. This, among other factors, can result in intrinsic defects such as silicon vacancies and carbon vacancies. 
Furthermore, it is generally expected that the formation (or prevention) of micropipes is affected by the silicon to carbon ratio in the vapor phase.\nAdditionally, these growth issues are of greater concern as the diameter of the growing crystal increases. In this regard, in a commercial context growing larger diameter crystals is usually more efficient than growing smaller diameter crystals. In silicon-based technology, wafers as large as eight inches (200 millimeters) in diameter are commercially available and widely understood. In silicon carbide technology, however, three and four inch wafers (75-100 mm) still remain as a commercial upper limit.\nAccordingly, interest continues to exist in improving the techniques for bulk growth of silicon carbide end in correspondingly improving the resulting bulk"} +{"output_text": " for the specific print engine.\nThe invention also relates to a system for producing documents at a first site from database information produced at a second site remote from the first site. The system may comprise the following components: A first computer remote from the second site containing an object association table which associates document production jobs with specific documents and appropriate object descriptions. A specific print engine at the first site for imaging documents, and electronically controlled by a specific variable print image stream. A second computer at the second site capable", "input_text": " typically practiced so that the database information is supplied to and translated at the first site. The engine specific print stream typically has all variable information for control of the print engine, BI coded and selectable criteria codes.\nStep (b) may be practiced by document sorting according to a predetermined delivery mechanism, providing document references for all documents to be produced at the first site using the object association table, and adding variable data to the documents. 
Steps (c) and (d) may be practiced to image on the fly directly from the data source the character data for the print engine, typically only the font data being pre-rasterized, or an XLC system may pre-rasterize only the font and character data; with all variable data being provided from steps (b) and (c) in the engine specific print image stream, so that the print engine can print with substantially no limitations related to the number of different text combinations.\nA plurality of different specific print engines may be provided at the first site. Step (c) is then practiced to build a different engine specific print stream depending upon which print engine is utilized.\nThe invention also relates to a system for producing documents at a first site from database information produced at a second site remote from the first site. The system may comprise the following components: A first computer remote from the second site containing an object association table which associates document production jobs with specific documents and appropriate object descriptions. A specific print engine at the first site for imaging documents, and electronically controlled by a specific variable print image stream. A second computer at the second site capable of using the object association table to produce database information containing specific file names. A third computer remote from the second site for using the database information supplied by the second computer and a job formatting table contained within the third computer for translating the database information containing specific file names from the second computer to produce a print image stream"} +{"output_text": " Crkvenjakov, 244 Science, pp. 1645, 1989). This process has been applied to the sequencing of both small genomes (e.g., Haemophilus influenzae) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992) and large genomes (e.g., E. 
coli) (see Strezoska et al.,", "input_text": " of a cell (\u02dc20-50 \u03bcm2) or a nucleus (\u02dc10 \u03bcm2) at a relatively high local concentration. Furthermore, the probe/target hybridization signal is confined to a microscopic and morphologically distinct area; this makes it easier to distinguish a positive signal from artificial or non-specific signals than hybridization on a solid support.\nMimicking the in-situ hybridization in some aspects, new techniques are being developed for carrying out multiple sample nucleic acid hybridization analysis on micro-formatted multiplex or matrix devices (e.g., DNA chips) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). These methods usually attach specific DNA sequences to very small specific areas of a solid support, such as micro-wells of a DNA chip. These hybridization formats are micro-scale versions of the conventional \u201creverse dot blot\u201d and \u201csandwich\u201d hybridization systems.\nThe micro-formatted hybridization can be used to carry out \u201csequencing by hybridization\u201d (SBH) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). SBH makes use of all possible n-nucleotide oligomers (n-mers) to identify n-mers in an unknown DNA sample, which are subsequently aligned by algorithm analysis to produce the DNA sequence (R. Drmanac and R. Crkvenjakov, Yugoslav Patent Application #570/87, 1987; R. Drmanac et al., 4 Genomics, 114, 1989; Strezoska et al., 88 Proc. Natl. Acad. Sci. USA 10089, 1991; and R. Drmanac and R."} +{"output_text": " however, been shown that the sTNF-Rs can also have a detrimental effect, by inducing the production of other cytokines, such as IL-1 and IL-6, which are involved in the inflammatory response.\nThe TNF-Rs are type I transmembrane proteins, consisting of a short extracellular domain, a transmembrane domain, and a long cytoplasmic domain. 
The extracellular domain of the TNF-Rs is composed of three subdomains, A, B and C, which are connected by two loops, L", "input_text": "TNF.alpha. has been shown to be involved in several diseases, examples of which are adult respiratory distress syndrome, pulmonary fibrosis, malaria, infectious hepatitis, tuberculosis, inflammatory bowel disease, septic shock, AIDS, graft-versus host reaction, autoimmune diseases, such as rheumatoid arthritis, multiple sclerosis and juvenile diabetes, and skin delayed type hypersensitivity disorders.\nEvidence that some of the effects of TNF.alpha. can be detrimental to the host have attracted attention to the mechanisms that regulate TNF.alpha. function. The intracellular signals for the response to TNF.alpha. are provided by cell surface receptors (herein after TNF-R), of two distinct molecular species, to which TNF.alpha. binds at high affinity.\nThe cell surface TNF-Rs are expressed in almost all cells of the body. The various effects of TNF.alpha., the cytotoxic, growth-promoting and others, are all signalled by the TNF receptors upon the binding of TNF.alpha. to them. Two forms of these receptors, which differ in molecular size, 55 and 75 kilodaltons, have been described, and will be called herein p55 and p75 TNF-R, respectively. It should be noted, however, that there exist publications which refer to these receptors also as p60 and p80 TNF-R.\nBoth receptors for TNF.alpha. exist not only in cell-bound, but also in soluble forms, consisting of the cleaved extracellular domains of the intact receptors, derived by proteolytic cleavage from the cell surface forms. These soluble TNF.alpha. receptors (sTNF-Rs) can maintain the ability to bind TNF.alpha. and thus compete for TNF.alpha. with the cell surface receptors and thus block TNF.alpha. activity. The sTNF-Rs thus function as physiological attenuators of the activity of TNF.alpha., safeguarding against its potentially harmful effects. 
It has,"} +{"output_text": " of the circuit. This communication is typically accomplished by using a transformer or capacitive divider circuit. However, these circuits are not available on the secondary side of the circuit because they are not available on the isolated secondary side of the circuit.\nAccordingly, there is a need for a simple, low cost, brownout detection circuit that can be used with a switch mode power converter.\nThe present invention relates to a method for producing a semiconductor device, and more particularly to a method for producing a", "input_text": " voltage and frequency (eg. 220V 50 Hz) AC input requirements can be met without having to specify different transformer types or capacitor values. Second, there is a reduction in the likelihood of field wiring mistakes that can occur when an electrician selects the wrong tap to power the system. Third, using a switch mode power converter instead of a transformer or capacitive divider circuit may allow the size of the emergency lighting system to be reduced.\nThe conventional methods of brownout detection cannot be used when a switch mode power converter is used in place of a transformer or capacitive divider circuitry. This is the case because using the switch mode power converter circuitry eliminates key elements normally relied upon to implement a simple, low cost, brownout detection circuit.\nA first problem is that the set voltage tap is not available because of the wide input voltage range topology inherent in the switch mode power converter. Without this tap there is no common reference point that can be used to generate the single voltage level that is proportional to the input voltage regardless of the value of that input voltage. 
Accordingly, brownout detection circuits that rely on this set voltage tap cannot be used with a switch mode power converter.\nAnother problem is that the change in the secondary output voltage over the wide range of input voltages is tightly regulated and therefore not useful to brownout detection circuits. This means that brownout detection topologies that rely on secondary side outputs can no longer be used to determine the change in the input voltage on the primary side of the circuit in systems that utilize a switch mode power converter.\nAnother complication in implementing common brownout detection techniques with switch mode power converter technology results from the frequent requirement in emergency lighting equipment to isolate the primary and secondary sides of the circuit. Specifically, the brownout detection circuitry located on the primary side of the circuit must communicate or send its output to the circuitry that controls the lamps or lighting on the isolated secondary side"} +{"output_text": " chip. The matrix of conductive leads is then secured relative to the AGA chip by a second insulating carrier. The second insulating carrier is then secured relative to the PWB by a third insulating carrier. The first and second insulating carriers are then removed, and the first ends of the matrix of conductive leads are aligned with the reciprocating matrix of conductive surface pads on the AGA chip. The first ends of the matrix of conductive leads are then secured relative to the reciprocating matrix of conductive surface pads on the AGA", "input_text": " and/or mechanical vibration.\nMoreover, once AGA chip 50 is mounted on PWB 70, accessing a connection point between a single conductive pad on AGA chip 50 and a reciprocal conductive pad on PWB 70 is difficult. When a solder ball joint fails, the entire AGA chip 50 must be removed from PWB 70 in order to effect repairs. 
While BGA packages have provided space reduction between the chip and PWB, the reliability problems associated with solder joints between semiconductor chips and printed wiring boards have continued.\nOne attempted solution includes the use of solder columns instead of solder balls. The solder columns are typically made of Sn10:Pb90 solder alloy. Although solder columns enhance compliancy, the columns bend easily and often experience problems as a result of handling during production. Solder columns also fail to provide improved strength or reliability over solder balls. In addition, the high lead content of this solder alloy is highly undesirable due to environmental concerns over the introduction of additional lead into the environment.\nAttempts have been made to use a conductive lead to connect an AGA chip to a PWB. For example, U.S. Pat. No. 5,455,390 discloses a method for placing a plurality of conductive connecting leads between the conductive surface pads of the AGA chip and the connecting surface pads of the PWB. However, this method still results in connection failures due to the less reliable type of material, e.g., gold, used to make the conductive connecting leads.\nU.S. Pat. No. 6,000,126 issued to the present inventor discloses an improved method of interconnecting an AGA chip to a PWB. This method includes orienting a first side of a matrix of a plurality of conductive leads, secured relative to one another in parallel by an insulating carrier, so that the first ends of the matrix are aligned with a reciprocating matrix of conductive surface pads on an AGA"} +{"output_text": " against both positive and negative voltage/current pulses, two SCRs are required.\nFIG. 2A shows a circuit schematic representation of a conventional two-SCR ESD protection circuit. As is seen from FIG. 2A, the two-SCR ESD protection circuit includes two SCRs 40 and 42 connected in series between a positive voltage source 44 and a negative voltage source 46. 
The positive voltage source 44 is connected to a first terminal 48 of the two-SCR E", "input_text": " anode terminal 12 and a cathode terminal 14. FIG. 1B shows a circuit schematic representation of SCR 10. As is seen from FIG. 1B, SCR 10 is composed of an npn bipolar transistor 32, a pnp bipolar transistor 30 and two parasitic resistors 34 and 36. Pnp transistor 30 consists of p+ emitter region 20, n-well region 26 serving as base, and p-substrate region 24 serving as collector. Npn transistor 32 consists of n+ emitter region 22, p-substrate region 24 serving as base, and n-well region 26 serving as collector. Parasitic resistor 34, shown in dashed line in FIG. 1A, is connected to anode terminal 12 via n+ contact portion 27 of n-well 26. Parasitic resistor 36, likewise shown in dashed line in FIG. 1A, is connected to cathode terminal 14 via p+ contact portion 25 of p-substrate 24.\nIn order to turn on SCR 10, a positive voltage must be applied between anode terminal 12 and cathode terminal 14 to forward bias both transistors 30 and 32. When SCR 10 turns on, a low impedance discharge path forms between the two terminals of SCR 10 to discharge the current.\nFIG. 1C shows the current-voltage characteristic of SCR 10. In FIG. 1C, the vertical axis represents the current flow between terminals, and the horizontal axis represents the voltage across terminals 12 and 14. The voltage at which SCR 10 enters the region characterized by a negative current-voltage relationship is called the snap-back or trigger voltage, which is shown in FIG. 1C as Vt.\nA major disadvantage of SCR 10 is that it provides protection against ESD in only one direction, i.e., either against a positive voltage/current pulse or against a negative voltage/current pulse. Consequently, to protect"} +{"output_text": " features. More particularly, the present invention relates to a landscape edging system that is easily installed and removed from a ground surface.\n2. 
Description of the Related Art\nLandscape edging systems are used to create borders around gardens, flower beds, trees and other landscape features. The landscape edging systems are typically installed around the perimeter of the landscape feature. The landscape edging systems are typically installed by digging a trench around the perimeter of the landscape feature and then filling the trench with landscape ed", "input_text": " patent application CN200920129260.3 discloses a process cartridge with a flexible pressure device. The flexible pressure device is arranged on a photosensitive drum and allows a driving force receiver to stably receive a driving force, so that the driving force receiver has free gap in the rotational axial direction of the photosensitive drum. Therefore, not only the driving force receiver has certain free gap in the rotational axial direction of the photosensitive drum and leans against a driving end of an image forming device to realize the assembly of a toner cartridge in the axial direction of the photosensitive drum but also the coaxial transmission between the driving force receiver and the photosensitive drum is more reliable and the structure is simpler. Moreover, as the driving force receiver is detachably arranged at one end of the photosensitive drum, the photosensitive drum is convenient in maintenance. As different driving force receivers are used for different image forming devices but the main body, namely the photosensitive drum, is the same, users only need to replace the driving force receiver but not need to replace the photosensitive drum, thus the manufacturing cost and the use cost are reduced. 
However, due to the flexible pressure device, the driving force receiver, namely the driving force receiving opening, is always in the pressurized state when beginning to get engaged and disengaged with a driving mechanism of the image forming device, thus the driving force receiver and the driving member for the image forming device cannot be kept in a straight line when beginning to get engaged and disengaged as the inner space of the image forming device is limited, consequently the driving force receiver and the driving member of the image forming device are inevitably subjected to the friction damage when meeting a bevel when beginning to get engaged and disengaged and then the engagement between the driving force receiver and the driving member of the image forming device is affected. The present invention relates in general to the field of landscape edging for creating borders around gardens, flower beds, trees and other landscape"} +{"output_text": ". The \u00bd ring parts 3 and 4 are arranged at the same intervals as the perforations of the sheets of loose-leaf paper or documents. The \u00bd ring parts 3 and 4 are made of a plastic material and are provided with a plurality of holes 5. The holes 5 are arranged at the same intervals as the perforations of the sheets of loose-leaf paper or documents. The \u00bd ring parts 3 and 4 are arranged at the same intervals as the perforations of the sheets of loose-leaf", "input_text": " of malfunction when the lock is actually needed. In addition, this type of lock locks the ram in exactly the same position every time. This is a disadvantage because the rams may need to be locked in a further closed position as the ram seals wear. Otherwise, the ram may be locked before it has traveled inwardly enough to completely seal off against the drill string. 
Furthermore, most blowout preventer rams are made to run over the center and when one ram has moved past the locking position the other ram may not have moved enough to be locked.\nA still further disadvantages of either of the above mentioned locking systems is that there is no good way to check whether or not the lock has been effected. The radial latch type lock cannot be checked since applying ram opening pressure would unlatch the lock. Opening pressure can be applied to the wedge type lock to determine whether or not it has been effected, but a low opening pressure will not assure that it is locked since the ram might be slightly hung or stuck and not actually locked. Therefore a high pressure must be applied to be sure the lock is effected and this tends to overload the locking device.\nGreat strides have been made in the development and improvement of blowout preventers. Improvements have also been made in locking such blowout preventers in the closed position. However, it is apparent that the present state of the art in locking blowout preventers still leaves much to be desired in efficiency, reliability and other operating, manufacturing and maintenance characteristics. A plastic binder for binding sheets of loose-leaf paper and documents perforated with multiple-hole paper punchers (refer to JP-A-2000-289376, for example) are known. A binder of the sort mentioned above will be described briefly herein below. FIGS. 20, 21 and 22 show a conventional binder 1 with a number of \u00bd ring parts 3 and 4 arranged at constant intervals"} +{"output_text": " crystals are grown by the Czochralski method. The main disadvantage of this method is a low yield of the Ce:LSO crystals.\nThe U.S. Pat. No. 6,413,311 describes also a method of growing the Ce:LSO crystals by the Czochralski method, where the Ce:LSO crystals are grown in the presence of a flux of the rare earth element. 
The main disadvantage of this method is a low yield of", "input_text": " or other gamma quantum is a complicate technical task. In this case an afterglow becomes a very undesirable effect, because it reduces an accuracy all system.\nThe afterglow and thermoluminescence phenomena are explored circumstantially for the Ce:LSO crystals (P. Dorenbost, C. van Eijekt, A. Bost, Melcher \u201cAfterglow and thermoluminescence properties of Lu2SiO5:Ce scintillation crystals\u201d, J.Phys.Condens.Matter 6 (1994), pp. 4167\u20134180). According to this article an afterglow is observed both in the crystals having a high light yield and a low light yield, and a conclusion is that an afterglow is a property immanent to the Ce:LSO substance.\nIt is known substance the cerium doped gadolinium oxyorthosilicate, Ce2yGd2(1\u2212x\u2212y)A2xSiO5, where A is at least one element selected from the group La (lanthanum) and Y (yttrium), the x and y values are varied within the limits 0ny, Nz defined by Nz=(nx\u2212nz)/(nx\u2212ny) will be \u22121.0