MambaBit. The most cursed LLM?


Category: Programming
Source: dev.to

Modern tokenizers come in all forms! Some, like Qwen's, have a vocabulary of roughly 150,000 tokens.

Byte-level models need only 256 tokens.

Can we go lower?

(There should be a "you were so busy asking if you could" meme here, but dev.to complains.)

But the answer is yes. MambaBit comes with just 2 tokens. One token for bit 0, one token for bit 1. That's it. Yet somehow it still produces something which is not completely random.
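To make the two-token vocabulary concrete, here is a minimal sketch of how text could be turned into bit tokens and back. It assumes UTF-8 bytes emitted most significant bit first; the post does not state the bit order MambaBit uses, and the function names `text_to_bits` and `bits_to_text` are hypothetical.

```python
def text_to_bits(text: str) -> list[int]:
    """Encode text as UTF-8, then emit each byte as 8 bit tokens (MSB first)."""
    bits = []
    for byte in text.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def bits_to_text(bits: list[int]) -> str:
    """Group bit tokens into bytes (MSB first) and decode them as UTF-8."""
    usable = len(bits) - len(bits) % 8  # drop a trailing partial byte
    data = bytearray()
    for i in range(0, usable, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    # Generated bits need not form valid UTF-8, so replace bad sequences.
    return data.decode("utf-8", errors="replace")


tokens = text_to_bits("Hi")
print(tokens)                # 16 tokens, each 0 or 1
print(bits_to_text(tokens))  # round-trips back to the input text
```

Every character costs at least 8 tokens under this scheme, so sequences get eight times longer than with a byte-level model.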

The prompt "Behold the most cursed" becomes:

Behold the most cursed of men.

LEONTES:
Now means means me not so much as my father,
In the good many lord, and my father come.

It still learned words! And line breaks, and speaker names that precede the normal text! Even at the bit level it produces words rather than random noise! It's too cursed!

We can also go even lower and fully embrace the bitness: layers like nn.LayerNorm have a built-in bias, which means bit 1 can be encoded as parms_for_1 + the bias from the normalization, and bit 0 as just the bias from the normalization. Say bye to nn.Embedding!
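The LayerNorm trick can be checked with a dependency-free sketch. It reimplements the standard LayerNorm formula (subtract the mean, divide by the square root of the biased variance plus eps, then apply scale and bias) in plain Python rather than calling nn.LayerNorm, and `embed_bit` plus all the numbers are hypothetical, not taken from MambaBit. Note that for bit 1 the result is the normalized version of the learned vector plus the bias, not the raw vector plus the bias.

```python
import math


def layer_norm(x, weight, bias, eps=1e-5):
    """Plain-Python version of the LayerNorm formula over one vector."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)  # biased variance
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]


def embed_bit(bit, params_for_1, weight, bias):
    """Scale one learned vector by the bit (0 or 1), then normalize it."""
    return layer_norm([bit * p for p in params_for_1], weight, bias)


params_for_1 = [0.5, -1.0, 2.0, 0.25]  # hypothetical learned vector for bit 1
weight = [1.0, 1.0, 1.0, 1.0]          # LayerNorm scale
bias = [0.1, -0.2, 0.3, 0.0]           # LayerNorm bias

zero_vec = embed_bit(0, params_for_1, weight, bias)
one_vec = embed_bit(1, params_for_1, weight, bias)

print(zero_vec)  # an all-zero input normalizes to zero, leaving only the bias
print(one_vec)   # normalized params_for_1, shifted by the same bias
```

Under these assumptions the only learned input parameters are one vector plus the normalization's own scale and bias, so no lookup table is needed.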

The same goes for lm_head. As of now, the model produces output that is essentially [X, -X], where X is usually in the range -3...3.
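One way to read the [X, -X] observation: a softmax over two logits that mirror each other depends only on X, and the first probability equals sigmoid(2X). The sketch below checks that identity numerically; it is an illustration of the arithmetic, not MambaBit's actual output code, and which of the two entries stands for bit 0 or bit 1 is not stated in the post.

```python
import math


def softmax2(a, b):
    """Numerically stable softmax over two logits."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    p_first, p_second = softmax2(x, -x)
    # softmax([X, -X])[0] = e^X / (e^X + e^-X) = 1 / (1 + e^(-2X))
    print(f"X={x:+.1f}  softmax first={p_first:.4f}  sigmoid(2X)={sigmoid(2 * x):.4f}")
```

If the two logits really are mirror images, a single scalar output would carry the same information as the two-way head.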

...
