Details about pre-training data

#41
by rxa615 - opened

Hi,
Thank you for the great work. I have downloaded the ZIP file containing the training data, which includes dev.txt and train.txt. Both files appear to contain DNA sequences. However, I could not find any information on how to map each sequence to one of the species described in Table 11 of the paper (https://arxiv.org/abs/2306.15006).

Could you please clarify how to identify the species corresponding to each sequence?

Best regards,
Raghav
