Isaac's Tech Notes



Late Chunking: The Better Way to Embed Document Chunks

Published on April 9, 2025

Solving the lost context problem in document retrieval with the embed-then-chunk approach

Your RAG system is probably broken due to the “lost context problem”:

When “Berlin” in one chunk can’t link to “3.85M inhabitants” in another, your query about “Berlin’s population” fails.

That’s where late chunking comes in.

When you segment documents first, each chunk gets embedded independently, destroying critical context. Pronouns lose referents, terminology becomes ambiguous, and connections between chunks disappear. In the blog post, we visualize this context loss.

The solution? Late chunking: an embed-then-chunk approach that preserves document-wide context in every chunk.

How it works:

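1. Run the ENTIRE document through the embedding model first to get token-level embeddings

2. Then segment those token embeddings into chunks and pool each chunk (for example, with mean pooling)

3. Each chunk’s embedding retains full document context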

Late chunking ensures ALL other tokens in the document influence each token’s embedding. “The city” correctly links to “Berlin” even across chunk boundaries.
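To make the difference concrete, here is a minimal sketch in plain Python. It is a toy under loud assumptions, not the Jina AI implementation: `toy_contextualize` is a hypothetical stand-in for a long-context embedding model (it just blends each token's vector with the average of whatever tokens it sees together), and `token_vector` uses made-up features (vowel counts and token length) instead of learned embeddings. The only thing it tries to show is the order of operations: chunk then embed, versus embed then chunk.

```python
# Toy sketch of naive chunking vs. late chunking.
# ASSUMPTION: `toy_contextualize` is a hypothetical stand-in for a long-context
# embedding model. Real late chunking pools a transformer's token embeddings.

VOWELS = "aeiou"


def token_vector(token):
    """Toy per-token features (made up for illustration): vowel counts + length."""
    lowered = token.lower()
    return [float(lowered.count(v)) for v in VOWELS] + [float(len(token))]


def mean(vectors):
    """Element-wise mean of equal-length vectors (mean pooling)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def toy_contextualize(tokens):
    """Blend each token vector with the mean of every token passed in with it."""
    raw = [token_vector(t) for t in tokens]
    context = mean(raw)
    return [[0.5 * r + 0.5 * c for r, c in zip(vec, context)] for vec in raw]


def naive_chunking(chunks):
    """Chunk first, then embed: each chunk only ever sees its own tokens."""
    return [mean(toy_contextualize(chunk)) for chunk in chunks]


def late_chunking(chunks):
    """Embed the whole document first, then pool token embeddings per chunk."""
    all_tokens = [token for chunk in chunks for token in chunk]
    token_embeddings = toy_contextualize(all_tokens)
    pooled, start = [], 0
    for chunk in chunks:
        end = start + len(chunk)
        pooled.append(mean(token_embeddings[start:end]))
        start = end
    return pooled


chunks = [
    "Berlin is the capital of Germany".split(),
    "The city has 3.85M inhabitants".split(),
]
naive = naive_chunking(chunks)
late = late_chunking(chunks)

# With naive chunking, the second chunk's embedding is identical whether or not
# the "Berlin" chunk exists. With late chunking, it carries document context.
second_chunk_alone = naive_chunking([chunks[1]])[0]
print("naive ignores the other chunk:", naive[1] == second_chunk_alone)
print("late differs from naive:", late[1] != naive[1])
```

In this toy, the naive embedding of “The city has 3.85M inhabitants” never changes no matter what the rest of the document says, while the late-chunked version shifts with the surrounding text. A real model does this through attention over the full document rather than a simple average.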

My blog post, Late Chunking: The Better Way to Embed Document Chunks, has the full details and code examples.

Thanks so much to Jina AI for the incredible research on this and for making the paper, code implementation, and several blog posts explaining it available to the public.

And thanks to Benjamin Clavié and my Dad for reviewing the post and giving really important feedback.


Stay up to date on all my writing at isaacflath.com

Isaac Flath

Hi! I'm Isaac Flath, a Tech Generalist passionate about creating beautiful, functional, useful things. While this often means AI, it just as often means Web App Development, DevOps, System Administration, and other things. AI is only one component (sometimes a relatively small one) of a successful AI application.
