Commits · luca-peric/blt-entropy-patcher

Updating decription

0d61fe6

luca-peric committed 5 days ago

local block causal when cuda avail

a37fec7

luca-peric committed 5 days ago

xformers when cuda available

661d10b

luca-peric committed 5 days ago

Xformers pkg version update

86969f4

luca-peric committed 5 days ago

Removing sphinx version spec

4747368

luca-peric committed 5 days ago

Adding spaces GPU decorator

4ec9e88

luca-peric committed 5 days ago

Removing specified CUDA version

d026e7c

luca-peric committed 5 days ago

Changing HF Space python version

75e90d9

luca-peric committed 5 days ago

Adding HF ZeroGPU torch version compatible

587f6ed

luca-peric committed 5 days ago

Revert "Removing CUDA deps"

7642f0e

luca-peric committed 5 days ago

cleaning things up via gemini 2.5 pro

f2f927b

luca-peric committed 7 days ago

more finishing touches

ad774a9

luca-peric committed 7 days ago

adding patch counts and cleaning up

545bc06

luca-peric committed 7 days ago

tiktoken & llama both plotted

b074257

luca-peric committed 7 days ago

Patches

a528449

luca-peric committed 7 days ago

Working locally, TBD HF space

2af55e5

luca-peric committed 8 days ago

Removing CUDA deps

d52b754

luca-peric committed 8 days ago

Not sure what happened, more deps

847b7ee

luca-peric committed 8 days ago

Adding again

c318efd

luca-peric committed 8 days ago

Removing gcc and gnu deps from req file

9825f33

luca-peric committed 8 days ago

Init for HF space

570eaa9

luca-peric committed 8 days ago

Visualisation working on CPU via CUDA_VISIBLE_DEVICE=-1 python demo_patcher.py 'Daenerys Targaryen is in Game of Thrones, a fantasy epic by George R.R. Martin.'

41ea791

luca-peric committed 8 days ago

Improve HF integration (#98)

1b67cbe
unverified

NielsRogge committed 17 days ago

Open source weights! (#97)

96d51b5
unverified

par-meta committed 18 days ago

Cast int sample id to str (#96)

e299427
unverified

Srinivasan Iyer

sviyer committed 27 days ago

Init distributed when loading model (#94)

138c2f3
unverified

Srinivasan Iyer

sviyer committed 27 days ago

Fix eval mask (#93)

19a3f75
unverified

Srinivasan Iyer

sviyer committed 27 days ago

remove selective activation checkpointing (#92)

8c1b1a7
unverified

Srinivasan Iyer

sviyer committed 28 days ago

update (#91)

1e78a49
unverified

par-meta committed on Apr 2

Get generation working for BLT (#86)

b79eb3e
unverified

par-meta committed on Apr 1

Fix in-place addition of patch_embds (#85)

2dcf48b
unverified

Hanna committed on Mar 20

Some fixes for entropy model predictions (#83)

fc946a1
unverified

Srinivasan Iyer

sviyer committed on Mar 13

Update ppl evals to work with blt model, in addition to entropy model (#82)

083656c
unverified

par-meta committed on Mar 13

Update iterate_data (#81)

f84ee63
unverified

par-meta committed on Mar 13

Add way to call consolidate (#80)

c110f6b
unverified

Srinivasan Iyer

sviyer committed on Mar 11

When merging configs, do not merge data sources (#79)

a5ceaaa
unverified

Srinivasan Iyer

sviyer committed on Mar 11

Get evals working again. (#46)

7517ac2
unverified

par-meta committed on Mar 11

Reduce per file resources arrow uses (#77)

63913e4
unverified

par-meta committed on Mar 5

Let process start before yielding preloaded prefetch buffer, avoid needlessly losing buffer in edge cases (#75)

8f2cf88
unverified

par-meta committed on Mar 5

Add approximate state persistence (#73)

ea1fc75
unverified

par-meta committed on Mar 5

Fix rsync to not preserve original permissions, instead use destination (#76)

9bd51df
unverified

par-meta committed on Mar 5

Correctly reset batch iterator at each arrow create_iter call. (#74)

c727844
unverified

par-meta committed on Mar 4

Pass mask in packing_iterator, correctly handle last batch, fix masking (#65)

08b8c7c
unverified

par-meta committed on Feb 27

Initialize rope embeddings properly for the entropy model (#72)

0da051f
unverified

Srinivasan Iyer

sviyer committed on Feb 25

Remove byte tokenizer and add config args to switch between byte/patch packing (#68)

aeb95f1
unverified

par-meta committed on Feb 25

Add vocab and seq len abstract fields (#66)

ff36aa8
unverified

par-meta committed on Feb 24

Fix: Correct model_args usage in parallelize_model call (#69)

a6ed14f
unverified

Bocheng Li committed on Feb 24

Update iterator inheritance, pass file format args, limit iterator (#63)

fc3399e
unverified

par-meta committed on Feb 22

Make apex logs less noisy (#60)

b0956bd
unverified

par-meta committed on Feb 18

Make it possible to specify multiple config files (#54)

82ab593
unverified

par-meta committed on Feb 18
