Questions regarding Stack v2 and StarCoder v2

#21

by aditya2211 - opened Mar 23, 2024

Hi BigCoders,

I have a few questions about Stack v2 and StarCoder v2:
(a) When can we expect the remaining Stack v2 data (documentation etc.) to be released?
(b) For StarCoder v2 pretraining, what policy was used for packing/chunking? Were documents chunked into multiple segments during pretraining (i) to pack more into each sequence and (ii) to accommodate longer documents?
