What's currently the biggest gap in open-source datasets?
This is a bit off-topic, but...
Datasets, and libraries for loading them, are available. So are trainers and training libraries.
However, because datasets are structured in so many different ways (the diversity itself is great), some people seem to struggle with data preprocessing (ds.rename_columns(), ds.map(), ...) or with writing a DataCollator (collate_fn).
It might be helpful to have a guide in an easily accessible location that explains how to actually use the datasets and related topics.
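To make that concrete, here is a rough sketch of the kind of thing such a guide might walk through. It uses plain Python only, so it runs without installing anything. The helper names (rename_columns, tokenize), the toy "tokenizer", the sample rows, and PAD_ID are all made up for illustration and are not part of any library.

```python
from typing import Any, Dict, List

PAD_ID = 0  # hypothetical padding id, chosen only for this sketch


def rename_columns(rows: List[Dict[str, Any]], mapping: Dict[str, str]) -> List[Dict[str, Any]]:
    """Plain-Python analogue of ds.rename_columns(): rename keys in every row."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


def tokenize(row: Dict[str, Any]) -> Dict[str, Any]:
    """Per-example preprocessing of the kind usually passed to ds.map().

    The "token id" here is just the word length, a stand-in for a real tokenizer.
    """
    ids = [len(word) for word in row["text"].split()]
    return {**row, "input_ids": ids}


def collate_fn(batch: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pad variable-length examples to the longest one in the batch."""
    max_len = max(len(example["input_ids"]) for example in batch)
    input_ids, attention_mask = [], []
    for example in batch:
        ids = example["input_ids"]
        pad = max_len - len(ids)
        input_ids.append(ids + [PAD_ID] * pad)
        attention_mask.append([1] * len(ids) + [0] * pad)
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": [example["label"] for example in batch],
    }


# Two rows whose column names do not match what the training code expects.
raw = [
    {"sentence": "a short one", "target": 1},
    {"sentence": "a somewhat longer example here", "target": 0},
]

rows = rename_columns(raw, {"sentence": "text", "target": "label"})
rows = [tokenize(row) for row in rows]
batch = collate_fn(rows)
print(batch["input_ids"])  # [[1, 5, 3, 0, 0], [1, 8, 6, 7, 4]]
```

With the actual libraries, the renaming step would typically be ds.rename_columns({...}), the per-example step ds.map(fn), and a function like collate_fn would be passed to a PyTorch DataLoader through its collate_fn argument, where the padded lists would normally be converted to tensors.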