Writing a Paper with crossmem
An end-to-end playbook for AI agents (Claude Code, Cursor, etc.) and their human authors. It walks you from installing crossmem and registering the MCP server to citing prior work correctly and quote-faithfully.
1. One-time setup
Install crossmem and its dependencies:
# Install crossmem
cargo install --path .
# Or, from the repo directly:
# cargo install --git https://github.com/crossmem/crossmem-rs
# Local LLM for paraphrase/implication generation
ollama pull llama3.2:3b
# PDF parser (preferred — produces bounding boxes)
pip install marker-pdf
# Fallback: brew install poppler (provides pdftotext)
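Before the first capture, it can be worth confirming that the optional tools resolve on your PATH. A minimal sketch (tool names taken from the install steps above):

```shell
# Optional sanity check before the first capture
for tool in ollama pdftotext; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool"
  else
    echo "missing: $tool"
  fi
done
```

If `pdftotext` is missing but marker-pdf is installed, compile still works; the loop above only flags the fallback path.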
Register the MCP server so your agent can call crossmem_cite and crossmem_recall:
Claude Code:
claude mcp add crossmem -- crossmem mcp serve
Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"crossmem": {
"command": "crossmem",
"args": ["mcp", "serve"]
}
}
}
2. Capturing a paper
crossmem capture https://arxiv.org/abs/1706.03762
Output:
[capture] arxiv_id: 1706.03762
[capture] title: Attention Is All You Need
[capture] cite_key: vaswani2017attention
[capture] saved to ~/crossmem/raw/1776227254_vaswani2017attention.pdf
This does three things:
- Downloads the PDF to ~/crossmem/raw/<timestamp>_<cite_key>.pdf
- Fetches metadata from arXiv, CrossRef, and OpenAlex — reconciles across all three
- Generates a deterministic cite key via the pattern DSL
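The pattern DSL itself is configurable; as an illustration only, the key above can be reproduced with a hypothetical `<surname><year><first-title-word>` pattern (the real DSL may normalize differently):

```shell
# Hypothetical default pattern: <first-author-surname><year><first-title-word>
author="Ashish Vaswani"
year=2017
title="Attention Is All You Need"
surname=$(awk '{print tolower($NF)}' <<< "$author")
word=$(awk '{print tolower($1)}' <<< "$title")
echo "${surname}${year}${word}"
# → vaswani2017attention
```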
Then compile it into a wiki entry:
crossmem compile vaswani2017attention
This parses the PDF (Marker by default), splits it into chunks, runs each through Ollama for paraphrase and implication, and emits the wiki note at ~/crossmem/wiki/<timestamp>_vaswani2017attention.md.
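The chunking step is conceptually simple. A toy sketch, assuming chunks are split on blank-line paragraph boundaries (the real splitter may use layout information from Marker instead):

```shell
# Toy sketch of chunking: split extracted text on blank lines
printf 'Para one.\n\nPara two.\n\nPara three.\n' \
  | awk 'BEGIN{RS=""} {printf "chunk %d: %s\n", NR, $0}'
# → chunk 1: Para one.
# → chunk 2: Para two.
# → chunk 3: Para three.
```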
Note: YouTube ingestion is design-only — see YouTube Ingestion Pipeline.
Capturing non-arXiv papers
Most journal papers (e.g. JCP, Nature, PRL) are not on arXiv. crossmem capture supports them through DOI lookup and local PDF import.
If you have a DOI — CrossRef metadata is fetched automatically:
# DOI URL
crossmem capture https://doi.org/10.1063/5.0012345
# Bare DOI
crossmem capture 10.1063/5.0012345
If the paper is open-access, the PDF downloads via Unpaywall. Otherwise you’ll get instructions to download it manually.
If you already have the PDF — the most common path for paywalled journals:
# With DOI (recommended — gets full CrossRef metadata)
crossmem capture ~/Downloads/smith2023.pdf --doi 10.1063/5.0012345
# Without DOI — extracts what it can from PDF metadata
crossmem capture ~/Downloads/smith2023.pdf --cite-key smith2023transport
Direct PDF URL — for preprint servers, institutional repos:
crossmem capture https://chemrxiv.org/paper.pdf --doi 10.1234/chemrxiv.5678
All paths produce the same raw/ + .meta.json output. Then compile as usual:
crossmem compile smith2023transport
For a JCP submission with 24 references, a typical workflow is:
# Capture each reference — most will be local PDFs with DOIs
for pdf in ~/papers/jcp-refs/*.pdf; do
  # Pull the first DOI-like string from page 1 of the text layer
  doi=$(pdftotext -l 1 "$pdf" - | grep -oPm1 '10\.\d{4,9}/[^\s]+')
  if [ -n "$doi" ]; then
    crossmem capture "$pdf" --doi "$doi"
  else
    echo "no DOI found in $pdf; capture manually with --cite-key" >&2
  fi
done
# Then compile each one
for meta in ~/crossmem/raw/*.meta.json; do
key=$(jq -r .cite_key "$meta")
crossmem compile "$key"
done
3. The compiled wiki entry — what the agent sees
Frontmatter
---
cite_key: vaswani2017attention
title: "Attention Is All You Need"
authors:
- "Ashish Vaswani"
- "Noam Shazeer"
year: 2017
arxiv_id: "1706.03762"
doi: "10.48550/arXiv.1706.03762"
captured_at: "1776227254"
raw: "~/crossmem/raw/1776227254_vaswani2017attention.pdf"
pdf_sha256: "9a8f3b..."
parser: "marker"
chunks: 47
meta:
sources: ["arxiv", "crossref", "openalex"]
reconciled: true
warnings: []
---
After the frontmatter, five citation formats are pre-generated: APA, MLA, Chicago, IEEE, and BibTeX.
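For illustration, the pre-generated BibTeX block for this entry would look roughly like the following (field values taken from the frontmatter above; crossmem's exact output may differ):

```bibtex
@article{vaswani2017attention,
  title   = {Attention Is All You Need},
  author  = {Vaswani, Ashish and Shazeer, Noam and others},
  journal = {arXiv preprint arXiv:1706.03762},
  year    = {2017},
  doi     = {10.48550/arXiv.1706.03762}
}
```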
Chunks
Each chunk carries verbatim text, LLM-generated derivatives, and full provenance:
<!-- chunk id=p4s32c1 -->
> The dominant sequence transduction models are based on complex recurrent or
> convolutional neural networks that include an encoder and a decoder.
**Paraphrase:** Prior sequence models relied on RNNs or CNNs in an encoder-decoder setup.
**Implication:** This dependency on recurrence was the bottleneck the Transformer aimed to eliminate.
```yaml
provenance:
page: 4
section: "3.2 Scaled Dot-Product Attention"
bbox: [72.0, 340.5, 523.8, 412.1]
text_sha256: "5f3e1c..."
byte_range: [18342, 19104]
```
Hard rule for agents: The > blockquote is the verbatim original extracted from the PDF. When citing, the agent MUST copy from this blockquote. NEVER fabricate or rephrase quotes. The Paraphrase and Implication fields exist for the agent’s reasoning and search — they do not belong in the paper as attributed quotes.
4. Agent prompts that actually work
Finding relevant chunks
“Search my library for how transformer attention was originally motivated. Return cite_keys and page numbers.”
Agent calls:
crossmem_recall("transformer attention motivation", limit=5)
Returns a ranked list of {cite_key, title, section, excerpt}. The agent picks the most relevant hits and reports them.
Quoting with provenance
“Write a paragraph introducing self-attention. Quote vaswani2017attention page 2 verbatim, then paraphrase in my voice. Include BibTeX.”
Agent workflow:
- Calls crossmem_recall("self-attention vaswani2017attention") to find the right chunk
- Reads the wiki file to locate the page-2 chunk
- Copies the > blockquote verbatim into the draft as a block quote
- Writes a surrounding paraphrase in the author’s voice (informed by the Paraphrase field, not copying it)
- Calls crossmem_cite("vaswani2017attention", "bibtex") for the BibTeX entry
- Embeds the text_sha256 and page reference as a LaTeX comment so crossmem verify can trace provenance:
% crossmem: vaswani2017attention p4s32c1 sha256=5f3e1c...
\begin{quote}
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder.
\end{quote}
\cite{vaswani2017attention}
Citing multiple papers
“Compare how Vaswani 2017 and Devlin 2019 frame the importance of pre-training.”
Agent calls crossmem_recall("pre-training importance"), gets hits from both papers, reads the relevant chunks, and writes a comparison paragraph quoting both — each quote traced to its chunk ID.
Running a drift check
After the human edits the draft (or the agent revises it), verify that no quotes have been accidentally mutated:
crossmem verify
Output when clean:
[verify] checked 94 chunks across 3 wiki entries
[verify] 0 drifts detected
Output when a quote was altered:
[verify] DRIFT in vaswani2017attention chunk p4s32c1
expected: 5f3e1c...
actual: a1b2c3...
[verify] 1 drift detected
Exit code 1 means drift — the agent or human must restore the original quote from the wiki.
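Under the hood this is plain content hashing. A minimal sketch of the comparison, with both hash values hypothetical and the exact text normalization an assumption (macOS users: substitute `shasum -a 256` for `sha256sum`):

```shell
# Sketch of what verify does: re-hash the quote bytes and compare
# to the recorded text_sha256 (both values here are hypothetical)
recorded="a1b2c3"
quote="An edited, no-longer-verbatim quote"
actual=$(printf '%s' "$quote" | sha256sum | cut -c1-6)
if [ "$recorded" = "$actual" ]; then
  echo "[verify] ok"
else
  echo "[verify] DRIFT: expected $recorded, actual $actual"
fi
```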
Building the bib file
Collect all \cite{...} keys from a LaTeX draft and emit a single .bib:
grep -oP '\\cite\{[^}]+\}' draft.tex \
| sed 's/\\cite{//;s/}//' \
| tr ',' '\n' \
| sort -u \
  | while read -r key; do
      crossmem mcp serve <<< "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"crossmem_cite\",\"arguments\":{\"cite_key\":\"$key\",\"format\":\"bibtex\"}}}"
    done > references.bib
Or, have the agent do it: “Collect every cite key from my draft and produce a references.bib file using crossmem_cite.”
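Either way, a quick cross-check that every cited key actually landed in the bib file catches gaps before submission. A sketch, using illustrative file paths and contents:

```shell
# Cross-check: every \cite key in the draft should have a bib entry
# (file paths and contents below are illustrative)
cat > /tmp/draft.tex <<'EOF'
\cite{vaswani2017attention} builds on attention; \cite{devlin2019bert} extends it.
EOF
cat > /tmp/references.bib <<'EOF'
@article{vaswani2017attention,
  title = {Attention Is All You Need},
}
EOF
grep -oP '(?<=\\cite\{)[^}]+' /tmp/draft.tex | tr ',' '\n' | sort -u \
  | while read -r key; do
      grep -q "{$key," /tmp/references.bib || echo "missing: $key"
    done
# → missing: devlin2019bert
```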
5. What crossmem protects against
| Failure mode | How crossmem prevents it |
|---|---|
| Hallucinated citation metadata | Multi-source reconciliation: arXiv + CrossRef + OpenAlex, ≥2 must agree. Disagreements surface as warnings in frontmatter. |
| Hallucinated quotes | Agent contract: never compose original text, only copy the > blockquote. crossmem verify catches any post-hoc mutation via SHA-256 re-hashing. |
| Wrong page numbers | Every chunk carries page, section, and bbox — the reader can trace back to the exact PDF region. |
| Lost context | byte_range preserves the exact location in the raw PDF. Chunks retain their section heading for navigation. |
| Cite key collisions | Deterministic pattern DSL with a–z suffix tiebreaker (then _<count> if all 26 are taken). |
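The tiebreaker in the last row can be sketched as follows, with a hypothetical collision set (the real implementation lives in the pattern DSL):

```shell
# Sketch of the a–z suffix tiebreaker for colliding cite keys
base="smith2023transport"
existing="smith2023transport smith2023transporta"   # hypothetical library state
key=""
for s in "" a b c d e f; do
  candidate="$base$s"
  case " $existing " in
    *" $candidate "*) continue ;;   # already taken, try next suffix
    *) key=$candidate; break ;;
  esac
done
echo "$key"
# → smith2023transportb
```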
6. Limits
Be honest about what crossmem cannot do today:
- Scanned / image-only PDFs: Marker’s OCR quality varies. Chunks from poorly scanned pages may have garbled text.
- Math-heavy pages: The pipeline does not run Nougat or other math-aware extractors. Equations may appear as lossy Unicode approximations or be missing entirely.
- Non-arXiv sources: Journal papers captured via DOI or local PDF have single-source metadata (CrossRef only), so there is no cross-verification. Books and conference proceedings with non-standard DOIs may produce incomplete frontmatter.
- Single-author workflow: There is no shared library, sync, or multi-user conflict resolution. Each machine has its own ~/crossmem/ directory.
- Ollama dependency: Compile requires a running Ollama instance. If Ollama is down or the model is missing, compile will fail.
7. Minimal paper-writing session
A scripted walkthrough — capture two papers, write an intro paragraph, verify.
# Capture two papers
crossmem capture https://arxiv.org/abs/1706.03762
crossmem compile vaswani2017attention
crossmem capture https://arxiv.org/abs/1810.04805
crossmem compile devlin2019bert
Now prompt the agent:
“Write an introductory paragraph for my Related Work section. It should cite both vaswani2017attention and devlin2019bert, quoting one key sentence from each verbatim. Output LaTeX with \cite commands and the BibTeX entries.”
The agent:
- Calls crossmem_recall("attention mechanism transformer", limit=5) and crossmem_recall("pre-training bidirectional", limit=5)
- Reads the wiki entries for both papers, selects one chunk each
- Produces:
The Transformer architecture replaced recurrence with self-attention:
\begin{quote}
``The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder.''
\end{quote}
\cite{vaswani2017attention}. Building on this,
BERT demonstrated that bidirectional pre-training could be applied to a wide
range of NLP tasks:
\begin{quote}
``We introduce a new language representation model called BERT, which stands
for Bidirectional Encoder Representations from Transformers.''
\end{quote}
\cite{devlin2019bert}.
% crossmem: vaswani2017attention p1s0c1 sha256=...
% crossmem: devlin2019bert p1s0c1 sha256=...
- Calls crossmem_cite("vaswani2017attention", "bibtex") and crossmem_cite("devlin2019bert", "bibtex") to emit references.bib
Finally, verify nothing drifted:
crossmem verify
# [verify] checked 94 chunks across 2 wiki entries
# [verify] 0 drifts detected
The quotes in your LaTeX match the raw PDFs. Ship it.