Spaces: Running on Zero
Commit History
local block causal when cuda avail
a37fec7
xformers when cuda available
661d10b
Xformers pkg version update
86969f4
Removing sphinx version spec
4747368
Adding spaces GPU decorator
4ec9e88
Removing specified CUDA version
d026e7c
Changing HF Space python version
75e90d9
Adding HF ZeroGPU torch version compatible
587f6ed
Revert "Removing CUDA deps"
7642f0e
cleaning things up via gemini 2.5 pro
f2f927b
more finishing touches
ad774a9
adding patch counts and cleaning up
545bc06
tiktoken & llama both plotted
b074257
Patches
a528449
Working locally, TBD HF space
2af55e5
Removing CUDA deps
d52b754
Not sure what happened, more deps
847b7ee
Adding again
c318efd
Removing gcc and gnu deps from req file
9825f33
Init for HF space
570eaa9
Visualisation working on CPU via CUDA_VISIBLE_DEVICE=-1 python demo_patcher.py 'Daenerys Targaryen is in Game of Thrones, a fantasy epic by George R.R. Martin.'
41ea791
Improve HF integration (#98)
1b67cbe
Committed by NielsRogge
Open source weights! (#97)
96d51b5
Cast int sample id to str (#96)
e299427
Init distributed when loading model (#94)
138c2f3
Fix eval mask (#93)
19a3f75
remove selective activation checkpointing (#92)
8c1b1a7
update (#91)
1e78a49
Get generation working for BLT (#86)
b79eb3e
Fix in-place addition of patch_embds (#85)
2dcf48b
Committed by Hanna
Some fixes for entropy model predictions (#83)
fc946a1
Update ppl evals to work with blt model, in addition to entropy model (#82)
083656c
Update iterate_data (#81)
f84ee63
Add way to call consolidate (#80)
c110f6b
When merging configs, do not merge data sources (#79)
a5ceaaa
Get evals working again. (#46)
7517ac2
Reduce per file resources arrow uses (#77)
63913e4
Let process start before yielding preloaded prefetch buffer, avoid needlessly losing buffer in edge cases (#75)
8f2cf88
Add approximate state persistence (#73)
ea1fc75
Fix rsync to not preserve original permissions, instead use destination (#76)
9bd51df
Correctly reset batch iterator at each arrow create_iter call. (#74)
c727844
Pass mask in packing_iterator, correctly handle last batch, fix masking (#65)
08b8c7c
Initialize rope embeddings properly for the entropy model (#72)
0da051f
Remove byte tokenizer and add config args to switch between byte/patch packing (#68)
aeb95f1
Add vocab and seq len abstract fields (#66)
ff36aa8
Fix: Correct model_args usage in parallelize_model call (#69)
a6ed14f
Committed by Bocheng Li