Upload LSTM model and tokenizer

Files changed (10)

.gitattributes +2 -0
README.md +116 -0
config.json +7 -0
desktop.ini +2 -0
lstm_model/fingerprint.pb +3 -0
lstm_model/keras_metadata.pb +3 -0
lstm_model/saved_model.pb +3 -0
lstm_model/variables/variables.data-00000-of-00001 +3 -0
lstm_model/variables/variables.index +0 -0
tokenizer.json +3 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+lstm_model/variables/variables.data-00000-of-00001 filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,116 @@

+---
+tags:
+- text-generation
+- lstm
+- tensorflow
+library_name: tensorflow
+pipeline_tag: text-generation
+---
+# LSTM Text Generation Model
+This model was trained using TensorFlow/Keras for financial article generation tasks.
+## Model Details
+- **Model Type**: LSTM
+- **Framework**: TensorFlow/Keras
+- **Task**: Text Generation
+- **Vocabulary Size**: 30000
+- **Architecture**: Bi-directional Long Short-Term Memory (LSTM)
+## Usage
+```python
+from huggingface_hub import snapshot_download
+import tensorflow as tf
+import json
+import pickle
+import numpy as np
+# Download model files
+model_path = snapshot_download(repo_id="firobeid/L4_LSTM_financial_News_Headlines_generator")
+# Load the LSTM model
+model = tf.keras.models.load_model(f"{model_path}/lstm_model")
+# Load tokenizer
+try:
+    # Try JSON format first
+    with open(f"{model_path}/tokenizer.json", 'r', encoding='utf-8') as f:
+        tokenizer_json = f.read()
+    tokenizer = tf.keras.preprocessing.text.tokenizer_from_json(tokenizer_json)
+except FileNotFoundError:
+    # Fallback to pickle format
+    with open(f"{model_path}/tokenizer.pkl", 'rb') as f:
+        tokenizer = pickle.load(f)
+# Text generation functions
+from tensorflow.keras.preprocessing.sequence import pad_sequences
+def preprocess(texts, max_sequence_length = 71):
+    # Prepend the <s> start tag and lowercase the prompt
+    texts = '<s> {}'.format(texts.lower())
+    X = np.array(tokenizer.texts_to_sequences([texts]))
+    pad_encoded = pad_sequences(X,
+                                 maxlen= max_sequence_length,
+                                 padding='pre')
+    return pad_encoded
+def next_word(model, tokenizer,
+              text, num_gen_words=1,
+              random_sampling = False,
+              temperature=1):
+    '''
+    random_sampling : sample the next word from a categorical distribution over the model's
+    predicted probabilities instead of always taking the most likely word.
+    Lower temperatures result in more predictable text.
+    Higher temperatures result in more surprising text.
+    Experiment to find the best setting.
+    '''
+    input_text = text
+    output_text = [input_text]
+    for i in range(num_gen_words):
+        X_new = preprocess(input_text)
+        if random_sampling:
+            y_proba = model.predict(X_new, verbose = 0)[0, -1:, :]  # first sentence, last token
+            rescaled_logits = tf.math.log(y_proba) / temperature
+            pred_word_ind = tf.random.categorical(rescaled_logits, num_samples=1)
+            pred_word = tokenizer.sequences_to_texts(pred_word_ind.numpy())[0]
+        else:
+            y_proba = model.predict(X_new, verbose=0)[0]  # first sentence
+            pred_word_ind = np.argmax(y_proba, axis = -1)
+            pred_word = tokenizer.index_word[pred_word_ind[-1]]
+        input_text += ' ' + pred_word
+        output_text.append(pred_word)
+        if pred_word == '</s>':
+            return ' '.join(output_text)
+    return ' '.join(output_text)
+def generate_text(model, tokenizer, text, num_gen_words=25, temperature=1, random_sampling=False):
+    return next_word(model, tokenizer, text, num_gen_words, random_sampling, temperature)
+# Example usage
+# Pass the prompt as plain text: preprocess() prepends the <s> tag and lowercases it
+generate_text(model,
+              tokenizer,
+              "Apple",
+              num_gen_words = 10,
+              random_sampling = True,
+              temperature= 10)
+```
+## Training
+This model was trained on text data using an LSTM architecture for next-word prediction.
+## Limitations
+- Model performance depends on training data quality and size
+- Generated text may not always be coherent for longer sequences
+- Model architecture is optimized for the specific vocabulary it was trained on
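
Note on the README's `temperature` parameter: `next_word` divides the log of the predicted probabilities by `temperature` before sampling. The sketch below reproduces that rescaling in plain Python so its effect can be inspected without loading the model. The function names `apply_temperature` and `sample_index` and the toy distribution are illustrative assumptions, not part of the repository.

```python
import math
import random

def apply_temperature(probs, temperature=1.0):
    """Rescale a probability distribution as softmax(log(p) / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space; zero-probability entries stay at zero probability.
    scaled = [math.log(p) / temperature if p > 0 else float("-inf") for p in probs]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def sample_index(probs, temperature=1.0, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]

# Toy next-word distribution, made up for illustration.
probs = [0.7, 0.2, 0.1]
sharp = apply_temperature(probs, 0.5)   # concentrates mass on the top word
flat = apply_temperature(probs, 10.0)   # pushes the distribution toward uniform
```

A temperature of 1 leaves the distribution unchanged. A value as high as the `temperature= 10` used in the README's example call flattens it considerably, so over a 30000-word vocabulary the sampled words are likely to look close to random; values at or below 1 may be a more useful starting point.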

config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "model_type": "lstm",
+  "framework": "tensorflow",
+  "task": "text-generation",
+  "vocab_size": 30000,
+  "max_sequence_length": 71
+}
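
The README hardcodes `max_sequence_length = 71` in `preprocess`, the same value stored in this config. As a minimal sketch, assuming the snapshot has been downloaded to a local directory, the value could be read from `config.json` instead. The helper names `load_config` and `sequence_length` are hypothetical and do not exist in the repository.

```python
import json
from pathlib import Path

def load_config(model_dir):
    """Read config.json from a downloaded snapshot directory."""
    with open(Path(model_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)

def sequence_length(config, default=71):
    """Return the padding length, falling back to the README's hardcoded 71."""
    return int(config.get("max_sequence_length", default))
```

With the README's variables, this would be used as `preprocess(text, max_sequence_length=sequence_length(load_config(model_path)))`.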

desktop.ini ADDED

	@@ -0,0 +1,2 @@


+[.ShellClassInfo]
+IconResource=C:\Program Files\Google\Drive File Stream\108.0.1.0\GoogleDriveFS.exe,26

lstm_model/fingerprint.pb ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:438c6b8d68eefefcd45426c3ac4f4ae5fd3cd2dd2181ac88fa3b5007f62f4587
+size 55

lstm_model/keras_metadata.pb ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4ccd43f3683507d23c448f49a4d7d1b9d57f32f6379d9ff3499e8f615f111eef
+size 30349

lstm_model/saved_model.pb ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d0ccb06635318ba4810da725dea3a917867106cbcb0ab2f9c5494d5bcc043776
+size 12469352

lstm_model/variables/variables.data-00000-of-00001 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d212bec25afb2a4d0a2e53776debb686f758a2e08769a088bafe6a27fbd00407
+size 84689920

lstm_model/variables/variables.index ADDED

Binary file (1.55 kB).

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fba7c276e7f9a0e8a02881b676f0f6e7e9f984221508cc289a8e6a9c8f675842
+size 18883354
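
The three-line entries above are Git LFS pointer files: each records a spec version, a SHA-256 object ID, and the size in bytes of the real file. As a rough sketch, a downloaded file could be checked against its pointer as shown below. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, and the sample pointer text is the `lstm_model/fingerprint.pb` entry from this diff.

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large files such as the variables shard fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:438c6b8d68eefefcd45426c3ac4f4ae5fd3cd2dd2181ac88fa3b5007f62f4587\n"
    "size 55\n"
)
```

Calling `matches_pointer` on the downloaded `lstm_model/fingerprint.pb` with this `pointer` would confirm that the file arrived intact, assuming the download resolved the pointer to the actual file.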