Fix NotebookLM Context Drowning with zlib Compression

The promise of the 1-million-token context window is intoxicating. Tools like NotebookLM, Claude Pro, and Gemini 1.5 have sold us on a workflow that feels like magic: drop your entire codebase, the corresponding academic paper, and a few web articles into an AI, and ask it to synthesise the whole thing.

Recently, I wanted to do exactly that with Ubisoft's newly open-sourced ComfyUI-Chord—a complex, two-stage neural network framework for generating PBR materials. I wanted the AI to cross-reference the GitHub repository against their SIGGRAPH Asia 2025 paper and tell me what was missing.

So I zipped the repo, converted it to a massive Markdown file, uploaded it to NotebookLM, and asked a simple question about a constant in the code.

The AI confidently replied: “The source code for app.py is not present in the provided documents.”

But I was staring right at it. The code was there. The AI just couldn't “see” it.

Here is how we uncovered the “Context Drowning” problem in modern LLMs, and how a clever hack using zlib compression fixed it—turning a blind AI into an expert architectural reviewer.

The Problem: Attention Poison and the Structure Trap

When you dump a raw codebase into a Retrieval-Augmented Generation (RAG) system, two things get in the way of the retriever surfacing the right code, so the model never gets the chance to attend to it.

The Lockfile Poison. Modern repos contain massive metadata files (uv.lock, package-lock.json). A 2,000-line lockfile acts as an “attention sink.” The semantic density is so low, and the repetition so high, that the LLM's retrieval engine essentially glazes over, creating a massive “Lost in the Middle” blindspot that swallows the actual application logic located further down the file.

The Code Fence Trap. Standard Markdown code blocks are great for humans, but LLM indexers often use chunking strategies that treat text inside code fences as lower-priority or “unstructured” data compared to plain prose headers.

We needed a way to dynamically filter out the noise and format the code so the AI couldn't possibly ignore it.

The Breakthrough: The zlib Poison Score

Hard-coding ignores for uv.lock or yarn.lock is a game of whack-a-mole. We needed a heuristic that measures semantic density directly.

We realised that lockfiles, boilerplate, and compiled JSONs share one trait: they are highly repetitive, which means they compress incredibly well.

We wrote a simple Python function to evaluate every text file before passing it to the AI. We compress the raw text using zlib and calculate the ratio.

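import zlib

def calculate_poison_score(content):
    """Return the fraction of bytes zlib removes; higher means more repetitive."""
    raw_bytes = content.encode('utf-8', errors='ignore') if content else b''
    if not raw_bytes:
        return 0.0
    compressed = zlib.compress(raw_bytes)
    # Very short inputs can compress to more bytes than they started with,
    # so clamp at 0.0 to keep the score within 0.0 to 1.0.
    return max(0.0, round(1.0 - (len(compressed) / len(raw_bytes)), 3))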

Human-written logic (Python, C++, TypeScript) typically scores around 0.4 to 0.6. A massive lockfile scores 0.93+. By setting a “Poison Threshold” of 0.90, our pre-processor automatically strips out the structural noise regardless of the framework or language, shrinking the payload size by up to 80% while retaining 100% of the functional logic.
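The threshold step can be sketched roughly as follows. This is a minimal illustration, not the packer's actual code: `filter_poison_files` and the in-memory `files` dictionary are hypothetical, the file contents are synthetic stand-ins, and the scoring function is restated (with a clamp at zero for very short inputs) so the snippet runs on its own.

```python
import zlib

POISON_THRESHOLD = 0.90  # threshold quoted in the article


def calculate_poison_score(content: str) -> float:
    """Fraction of bytes removed by zlib; higher means more repetitive."""
    raw_bytes = content.encode("utf-8", errors="ignore")
    if not raw_bytes:
        return 0.0
    compressed = zlib.compress(raw_bytes)
    # Clamp at 0.0: tiny inputs can grow slightly when compressed.
    return max(0.0, round(1.0 - len(compressed) / len(raw_bytes), 3))


def filter_poison_files(
    files: dict[str, str], threshold: float = POISON_THRESHOLD
) -> dict[str, str]:
    """Hypothetical helper: keep only files scoring below the threshold."""
    return {
        path: text
        for path, text in files.items()
        if calculate_poison_score(text) < threshold
    }


# Synthetic stand-ins for a lockfile and a source file (not real repo contents).
fake_lockfile = '[[package]]\nname = "example"\nversion = "1.0.0"\n' * 500
fake_source = (
    "def normal_to_height(normal, iterations):\n"
    "    height = integrate(normal)\n"
    "    for step in range(iterations):\n"
    "        height = relax(height, normal, step)\n"
    "    return height\n"
)

files = {"uv.lock": fake_lockfile, "app.py": fake_source}
kept = filter_poison_files(files)
for path, text in files.items():
    status = "kept" if path in kept else "dropped"
    print(f"{path}: score={calculate_poison_score(text):.3f} ({status})")
```

Scores on a real repository will differ from these synthetic ones, so it is worth printing the score of every file and reviewing the dropped paths before trusting a single cutoff.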

Formatting for the AI: Atomic Multi-Packs

Filtering wasn't enough. We also had to fix the structure. We abandoned the idea of a single monolithic context.md file and built a script to restructure the repo into Atomic Context Packs:

Thematic splitting. We routed files into smaller, numbered buckets (01_Project_Overview.md, 02_Application_Logic.md, 04_Infrastructure.md). This allows NotebookLM to isolate its search based on the query type.
XML boundaries. We wrapped files in <source_file path="..."> tags. LLM indexers handle strict XML boundaries much better than loose Markdown headers, ensuring the file path is always semantically bound to the code.
Atomic prose anchors. We parsed the code and injected Markdown ### FUNCTION: [name] headers directly above every function definition. The code becomes a highly indexed knowledge base instead of a wall of fenced text.
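The three steps above might look something like this for Python sources. It is a sketch under stated assumptions rather than the RepoContextPack implementation: the routing rules in `PACK_RULES`, the default pack, and every function name here are hypothetical, and only the pack filenames, the `<source_file path="...">` tag, and the `### FUNCTION: [name]` header format come from the article.

```python
import ast
from xml.sax.saxutils import quoteattr

# Hypothetical routing rules: pack filename -> file suffixes sent to it.
PACK_RULES = [
    ("01_Project_Overview.md", (".md", ".txt")),
    ("04_Infrastructure.md", (".toml", ".yml", ".yaml", ".json")),
]
DEFAULT_PACK = "02_Application_Logic.md"


def pick_pack(path: str) -> str:
    """Choose a themed pack for a file based on its suffix."""
    for pack, suffixes in PACK_RULES:
        if path.lower().endswith(suffixes):
            return pack
    return DEFAULT_PACK


def add_function_anchors(source: str) -> str:
    """Insert a '### FUNCTION: name' line above each Python function."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return source  # leave unparseable files untouched
    anchors = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Anchor above the first decorator if there is one.
            first = min([node.lineno] + [d.lineno for d in node.decorator_list])
            anchors[first] = node.name
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number in anchors:
            out.append(f"### FUNCTION: {anchors[number]}")
        out.append(line)
    return "\n".join(out)


def wrap_source_file(path: str, body: str) -> str:
    """Bind the file path to its contents with an explicit boundary tag."""
    return f"<source_file path={quoteattr(path)}>\n{body}\n</source_file>"


def build_packs(files: dict[str, str]) -> dict[str, str]:
    """Group wrapped files into themed packs, keyed by pack filename."""
    packs: dict[str, list[str]] = {}
    for path, text in sorted(files.items()):
        body = add_function_anchors(text) if path.endswith(".py") else text
        packs.setdefault(pick_pack(path), []).append(wrap_source_file(path, body))
    return {name: "\n\n".join(parts) for name, parts in packs.items()}


demo = {
    "README.md": "# Demo project",
    "nodes.py": "import os\n\ndef load(path):\n    return os.path.exists(path)\n",
}
packs = build_packs(demo)
print(packs["02_Application_Logic.md"])
```

The file body is not escaped, so a file that itself contains `</source_file>` would confuse the boundary. The tags here act as delimiters for the indexer rather than as strictly valid XML.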

The Payoff: An Expert Architectural Review

To test the new ingestion pipeline, I ran the Ubisoft ComfyUI-Chord repo through our packer and uploaded the resulting five themed Markdown files (plus a PDF of the visual workflows) into a fresh NotebookLM notebook, alongside the original research paper.

I asked the ultimate test question.

My prompt: “How well does the codebase match up to the ‘Chord: Chain of Rendering Decomposition for PBR Material Estimation’ paper? Are there areas of improvement for the code and/or ComfyUI pipelines to meet the standards of the paper? Conversely, does the paper make claims that cannot be substantiated by the codebase?”

NotebookLM's response:

The ComfyUI codebase aligns closely with the core inference methodologies described in the ‘Chord: Chain of Rendering Decomposition’ paper, specifically mirroring its stage-two material estimation pipeline. However, because the repository is designed strictly as an inference tool for ComfyUI, it omits the training infrastructure necessary to substantiate all of the paper's claims…
How the codebase matches the paper
Two-stage framework: The paper proposes a ‘generate-and-estimate’ pipeline…
Sequential CHORD pipeline: The codebase substantiates this physical logic via functions like compute_approxIrr, compute_approxRouMet, process_irradiance, and find_light_dir.
Normal-to-height integration: The codebase provides a dedicated ChordNormalToHeight node and a normal_to_height.py script that uses a Fast Fourier Transform (solve_poisson_fft) to solve the Poisson equation…
Paper claims unsubstantiated by the codebase
Training mechanics and losses: Because the repository is exclusively an end-user ComfyUI inference wrapper (nodes.py and io.py), none of the training scripts, loss functions (Eq. 7–9), or dataset preparation utilities are present.
Dataset volume: The paper claims the model was trained on a massive dataset of 28,344 materials… The codebase only loads the final compiled chord_v1.safetensors model, meaning the dataset size claims cannot be independently verified.
Areas of improvement for the code/pipelines
Resolution upscaling: The paper evaluates outputs at 1K resolution. Production environments often require 2K/4K… there is no public ‘PBR upscale’ stage included.
Broader material channels: A production-ready pipeline would benefit from generating Ambient Occlusion (AO), specular variations, and clearcoat.

Note how nuanced and accurate the response is. It is the same model, fed the same source material, but in a structure that respects how its retriever actually works.

Conclusion

The 1M-token window is a massive leap forward, but sending raw files to an LLM is like giving someone a library with no Dewey Decimal system and half the books filled with random metadata. By applying a simple zlib compression heuristic and structuring codebases into XML-bounded thematic “atomic packs,” we can sidestep the retrieval failures that made the code invisible in the first place.

If you want to pack your own repositories for NotebookLM or Claude, we open-sourced the Streamlit utility we built to do this. You can find RepoContextPack on GitLab. Stop drowning your AI in noise, and start getting actual insights out of your code.

Key Takeaways

1. Lockfiles and boilerplate are attention poison. Repetitive, low-density files act as “sinks” that swallow the retrieval engine's attention and create “Lost in the Middle” blindspots in the code that follows.
2. zlib compression ratio is a language-agnostic proxy for semantic density. A “Poison Threshold” of 0.90 filters structural noise without per-framework allowlists, cutting payload size by up to 80% while keeping 100% of the functional logic.
3. XML boundaries beat Markdown code fences. Wrapping files in <source_file path="..."> tags keeps the file path semantically bound to the code, something most LLM chunkers handle far better than fenced blocks.
4. Thematic atomic packs let the retriever isolate its search. Splitting one monolithic context.md into numbered, themed packs lets it narrow down by query type instead of scanning everything.
5. Prose anchors turn code into an indexed knowledge base. Injecting ### FUNCTION: [name] headers above every definition gives the indexer something it actually likes to retrieve.

