Stories by og_kalu

Gemini Diffusion

Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names

Over-Tokenized Transformer: Vocabulary Is Generally Worth Scaling

LLMs struggle with perception, not reasoning, in ARC-AGI

EvaByte: Efficient Byte-Level Language Models at Scale

Tell me about yourself: LLMs are aware of their learned behaviors

Imagine While Reasoning in Space: Multimodal Visualization-of-Thought

Byte Latent Transformer: Patches Scale Better Than Tokens

Mastering Board Games by External and Internal Planning with Language Models

Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space

GameGen-X: Open-World Video Game Generation

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Kurzgesagt: We Fell for the Oldest Lie on the Internet [video]

Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-Wise LoRA

Solving Global Lyapunov functions: open problem in mathematics with transformers

ChatGPT Topped 3B Visits in September

Tx-LLM: Supporting therapeutic development with large language models

Visual Autoregressive Modeling: Image Generation via Next-Resolution Prediction
