Stories by og_kalu

Multimodal Neurons in Pretrained Text-Only Transformers

From Sparse to Soft Mixtures of Experts. Outperforms Dense/Sparse models

Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

Communicative LLM Agents for Software Development

GPT-4 Vision

Generating songs with coherent speech and sound effects

Does Visual Pretraining Help End-to-End Reasoning?

One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Model card and evaluations for Claude models [pdf]

Large Language Models can complete complex non linguistic patterns in context

Large Language Models as General Pattern Machines

Teaching Arithmetic to Small Transformers

GPT-4 solves Mystery-o-Matic's Mystery Puzzle of the day

XTrimoPGLM: Unified 100B-Scale Transformer for Deciphering the Protein Language

Instruct tuned Mixture of Experts LLMs significantly surpass dense counterparts

Building Cooperative Embodied Agents Modularly with Large Language Models

KokoMind: Can LLMs Understand Social Interactions?

LongNet: Scaling Transformers to 1B Tokens
