Multi-output DNA Structure Regressor (PyTorch)
Description
This model is a multi-output DNA structure regressor built and trained from scratch in PyTorch.
It predicts six structural stability metrics directly from engineered DNA sequence features: Minimum Free Energy (MFE), number of base pairs, mean stem length, number of stems, number of hairpins, and number of internal loops.
Trained on the [aedupuga/2025-scaffold-structures] dataset, the model provides a fast, lightweight alternative to more complex and time-consuming simulation tools like NUPACK, enabling near-instant predictions for plasmid stability analysis.
Model
- Architecture: 3-layer MLP (512→256→128, dropout 0.3)
- Inputs: 109658 features
- Outputs: 6 targets: mfe_energy, num_pairs, stem_len_mean, num_stems, num_hairpins, num_internal_loops
- Loss: MSE
- Optimizer: Adam (lr=0.0001)
- Epochs: 15
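The configuration listed above could be expressed in PyTorch roughly as follows. This is a minimal sketch, not the author's training code: the ReLU activations, the placement of dropout after each hidden layer, the separate 6-unit output head, and the names `build_mlp` and `make_training_objects` are all assumptions.

```python
import torch
from torch import nn

IN_FEATURES = 109658  # input width from the model card
N_TARGETS = 6         # mfe_energy, num_pairs, stem_len_mean, num_stems, num_hairpins, num_internal_loops


def build_mlp(in_features: int = IN_FEATURES,
              n_targets: int = N_TARGETS,
              dropout: float = 0.3) -> nn.Sequential:
    """Hidden layers of width 512, 256, 128 followed by a linear output head.

    Assumption: ReLU + dropout after every hidden layer; the card only
    states the widths and the dropout rate.
    """
    layers = []
    width_in = in_features
    for width_out in (512, 256, 128):
        layers += [nn.Linear(width_in, width_out), nn.ReLU(), nn.Dropout(dropout)]
        width_in = width_out
    layers.append(nn.Linear(width_in, n_targets))
    return nn.Sequential(*layers)


def make_training_objects(model: nn.Module):
    """MSE loss and Adam with lr=0.0001, as listed on the card."""
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    return loss_fn, optimizer
```

With the full input width the first linear layer alone holds about 56 million weights, so passing a smaller `in_features` is convenient for quick experiments.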
Metrics (test)
- Overall MSE: 15022.6787
- Overall R²: -34.0313
- Training time (s): 131.85
- Prediction time (s): 0.2694
MAE per target
{
"mfe_energy": 139.4054718017578,
"num_pairs": 116.53337097167969,
"stem_len_mean": 2.4054114818573,
"num_stems": 69.17422485351562,
"num_hairpins": 14.115099906921387,
"num_internal_loops": 94.97564697265625
}
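For reference, metrics of this kind can be computed from arrays of true and predicted targets as sketched below. The card does not say how the overall R² was aggregated; this sketch assumes an unweighted mean of per-target R² values, and the function name `regression_metrics` is hypothetical. An R² below zero means the predictions fit worse than a constant predictor that always returns the mean of each target.

```python
import numpy as np

TARGETS = ["mfe_energy", "num_pairs", "stem_len_mean",
           "num_stems", "num_hairpins", "num_internal_loops"]


def regression_metrics(y_true, y_pred):
    """Overall MSE, mean per-target R^2, and per-target MAE.

    Both inputs have shape (n_samples, 6). Assumes every target column
    of y_true has non-zero variance (otherwise R^2 is undefined).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    mse = float(np.mean(err ** 2))

    # R^2 per target: 1 - residual sum of squares / total sum of squares
    ss_res = np.sum(err ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    r2 = float(np.mean(1.0 - ss_res / ss_tot))

    mae = dict(zip(TARGETS, np.mean(np.abs(err), axis=0).tolist()))
    return {"mse": mse, "r2": r2, "mae_per_target": mae}
```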
Usage
pip install torch numpy
python inference.py
Before running inference, apply the same preprocessing that was used during training (e.g., scaling, SVD).
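The contents of `inference.py` are not reproduced here, so the following is only a sketch of the general pattern: standardize the features with statistics saved from training, then run the model in evaluation mode. The `predict` function and its `mean`/`std` arguments are hypothetical, and a dimensionality-reduction step such as SVD, if one was used in training, would be applied to the features in the same place before the model call.

```python
import numpy as np
import torch
from torch import nn


def predict(model: nn.Module,
            features: np.ndarray,
            mean: np.ndarray,
            std: np.ndarray) -> np.ndarray:
    """Scale features with training-time statistics and return predictions.

    Assumptions: `mean` and `std` were saved from the training set, and the
    model lives on the CPU.
    """
    # Guard against zero-variance columns to avoid division by zero
    safe_std = np.where(std == 0, 1.0, std)
    scaled = (features - mean) / safe_std

    x = torch.as_tensor(scaled, dtype=torch.float32)
    model.eval()               # disables dropout
    with torch.no_grad():      # no gradients needed at inference time
        return model(x).cpu().numpy()
```

Calling `model.eval()` matters for this architecture because dropout would otherwise randomly zero activations and make predictions non-deterministic.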
Limitations
- Performance is less reliable for shorter DNA strands, as the training data primarily consists of longer plasmid sequences.
- The model is intended for educational and exploratory research use, not for experimental or clinical validation.