MoL-MoE: Multi-view Mixture-of-Experts framework
This repository provides PyTorch source code of the framework MoL-MoE.
GitHub link: GitHub link
For more information contact: [email protected] or [email protected].
Introduction
We present MoL-MoE, a Multi-view Mixture-of-Experts framework designed to predict molecular properties by integrating latent spaces derived from SMILES, SELFIES, and molecular graphs. Our approach leverages the complementary strengths of these representations to enhance predictive accuracy. Here, we evaluate the performance of MoL-MoE with a total of 12 experts, organized into 4 experts for each modality (SMILES, SELFIES, and molecular graphs).
Table of Contents
Getting Started
This code and environment have been tested on Nvidia V100s
Replicating Conda Environment
Follow these steps to replicate our Conda environment and install the necessary libraries:
Create and Activate Conda Environment
conda create --name mol-moe-env python=3.10
conda activate mol-moe-env
Install Packages with Conda
conda install pytorch=2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
Install Packages with Pip
pip install -r requirements.txt
Demos
Use the following notebooks depending of the number of activated experts k=4
or k=6
located at:
notebooks/
βββ MoE_FM_Multi_output_BBBP_k=4.ipynb
βββ MoE_FM_Multi_output_BBBP_k=6.ipynb
All used datasets can be found at data/moleculenet/
. To change the dataset, just edit in the notebook the path with the name of the dataset:
train_df = pd.read_csv("../data/moleculenet/bace/train.csv")
valid_df = pd.read_csv("../data/moleculenet/bace/valid.csv")
test_df = pd.read_csv("../data/moleculenet/bace/test.csv")
For single task regression datasets, one may have to change the output dimension output_dim
of the predictor to 1
:
net = Net(smiles_embed_dim=2048, dropout=0.2, output_dim=1)
For multi-task datasets, the output_dim
argument can be edited to the desired number of output predictions.