Translation Small T5
Trained with a 2048-token context length, this model translates Malay, English, Javanese, Banjarese and Indonesian text into a target language. It also preserves the structure of the input and translates only the text that needs translating, leaving content such as programming code untouched.
This version adds more code-translation data and applies heavy post-filtering.
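The how-to snippet in this card selects the target language with a plain-text prefix, `terjemah ke Melayu: `, prepended to the input. The sketch below wraps that convention in a small helper. `build_prompt` is a hypothetical name, and only the `Melayu` target appears in this card, so any other target name is an assumption that should be checked against the model before use.

```python
def build_prompt(text: str, target: str = "Melayu") -> str:
    """Prefix `text` with the translation instruction used in this card.

    Only target="Melayu" is confirmed by the card; other values are assumptions.
    """
    # The card strips surrounding whitespace from the input before prefixing it.
    return f"terjemah ke {target}: {text.strip()}"


prompt = build_prompt("  Hello, how are you?\n")
print(prompt)
```

The resulting string can be passed to `tokenizer.encode` in place of the inline f-string.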
how-to
from transformers import T5ForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    'mesolitica/translation-t5-small-standard-bahasa-cased-code',
    use_fast=False
)
model = T5ForConditionalGeneration.from_pretrained(
    'mesolitica/translation-t5-small-standard-bahasa-cased-code'
)
answer = """
First, let's start with implementing the `is_number` function, which checks whether the given Variant is number type or not. It checks the type of the Variant and returns whether it is an integer or a real number.
```cpp
#include <cmath>
#include <string>
namespace godot {
// ... (other Variant function declarations here)
class Variant {
// ... (other Variant function definitions here)
public:
bool is_number() const {
switch (get_type()) {
case INT:
case REAL:
return true;
default:
return false;
}
}
// ... (other Variant function definitions here)
};
} // namespace godot
Next, we can create the `apply_operation` function, which takes a `Variant` object, an operation (either "+" or "-"), and a `Variant` operand as the input, applying the given operation to the original `Variant` object and the operand if the original `Variant` object is a number. If the `Variant` object is not a number, the function should throw an exception.
#include <stdexcept>
#include <string>
// ... (other Variant and function declarations here)
Variant apply_operation(Variant a, const std::string& operation, Variant b) {
if (!a.is_number()) {
throw std::invalid_argument("The first Variant is not a number.");
}
if (operation != "+" && operation != "-") {
throw std::invalid_argument("Invalid operation. Supported operations are '+' and '-'.");
}
if (a.get_type() == INT) {
if (b.is_number()) {
if (b.get_type() == INT) {
a = a.operator int() + b.operator int();
} else { // REAL
a = a.operator int() + b.operator double();
}
} else {
throw std::invalid_argument("The second Variant is not a number.");
}
} else { // REAL
if (b.is_number()) {
if (b.get_type() == INT) {
a = a.operator double() + b.operator int();
} else { // REAL
a = a.operator double() + b.operator double();
}
} else {
throw std::invalid_argument("The second Variant is not a number.");
}
}
if (operation == "-") {
a = -a;
}
return a;
}
"""
# Tokenize the prompt and move the inputs to the model's device.
input_ids = tokenizer.encode(f'terjemah ke Melayu: {answer.strip()}', return_tensors='pt').to(model.device)
outputs = model.generate(input_ids, max_length=512)
# Drop the special token ids (0, 1, 2) before decoding.
outputs = [o for o in outputs[0] if o not in [0, 1, 2]]
print(tokenizer.decode(outputs, spaces_between_special_tokens=False, skip_special_tokens=False))
Pertama, mari kita mulakan dengan melaksanakan fungsi `is_number`, yang memeriksa sama ada Variant yang diberikan adalah jenis nombor atau tidak. Ia memeriksa jenis Variant dan mengembalikan sama ada ia adalah integer atau nombor sebenar.
#include <cmath>
#include <string>
namespace godot {
//... (deklarasi fungsi Variant lain di sini)
class Variant {
//... (definisi fungsi Variant lain di sini)
public:
bool is_number() const {
switch (get_type()) {
case INT:
case REAL:
return true;
default:
return false;
}
}
//... (definisi fungsi Variant lain di sini)
};
} // namespace godot
Seterusnya, kita boleh membuat fungsi `apply_operation`, yang mengambil objek `Variant`, operasi (sama ada "+" atau "-"), dan operand `Variant` sebagai input, menerapkan operasi yang diberikan ke objek `Variant` asal dan operand jika objek `Variant` asal adalah nombor. Jika objek `Variant` bukan nombor, fungsi harus melemparkan pengecualian.
#include <stdexcept>
#include <string>
//
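The sample output above stops at `//`, which is consistent with generation reaching `max_length = 512` before the model emitted an end-of-sequence token. The sketch below shows one way to filter the special ids as the snippet does and to flag a likely truncation at the same time. `clean_output` is a hypothetical helper, and treating id `1` as end-of-sequence is an assumption based on the usual T5 convention, not something this card states.

```python
from typing import List, Sequence, Tuple

SPECIAL_IDS = (0, 1, 2)  # ids filtered out in this card's snippet
EOS_ID = 1               # assumption: T5-style end-of-sequence id


def clean_output(ids: Sequence[int], eos_id: int = EOS_ID) -> Tuple[List[int], bool]:
    """Return (ids without special tokens, whether generation looks truncated)."""
    ids = [int(i) for i in ids]
    # A finished sequence contains the end-of-sequence id; a cut-off one does not.
    truncated = eos_id not in ids
    kept = [i for i in ids if i not in SPECIAL_IDS]
    return kept, truncated


# Toy ids standing in for model.generate(...)[0].tolist()
finished = clean_output([0, 10, 11, 1])
cut_off = clean_output([0, 10, 2, 11])
print(finished, cut_off)
```

When the flag is set, raising `max_length` or translating the input in smaller pieces are the usual options to try.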
import torch