engineering · 8 min read · Apr 28, 2026

Tessera: Cache-Line Encryption for Edge AI Without Bandwidth Loss

A hardware architecture that decrypts neural network weights at 64-byte granularity, hiding cryptographic overhead within DRAM fetch latency on shared-memory edge accelerators.

Source: arxiv/cs.LG · Animan Naskar

Tessera decrypts DNN weights inline at cache-line granularity, achieving near-zero overhead on UMA edge devices by parallelizing AES-256-CTR with DRAM access.

  • UMA systems expose plaintext model weights to OS-level and physical attacks because the CPU and NPU share DRAM.
  • Page-level encryption (4 KB granularity) wastes bandwidth by fetching entire pages for small tensor tiles, incurring up to a 32x penalty.
  • Tessera intercepts 64-byte AXI bursts and computes AES-256-CTR keystreams in parallel with DRAM fetches, hiding crypto latency.
  • Decrypted weights stream directly into isolated NPU SRAM, eliminating permanent memory carve-outs required by trusted execution environments.
  • Measured across three SoC platforms, Tessera achieves 98.4% of theoretical bandwidth with only 1.6% overhead.
  • Architecture neutralizes DRAM extraction, rogue DMA, and compute hijacking attacks while preventing plaintext leakage across sparse tensors.
  • Design maintains constant 1x memory footprint across all layer geometries, unlike page-level schemes that degrade with irregular tensor shapes.
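The property that makes cache-line decryption possible is CTR mode itself: the keystream for any 64-byte line depends only on the key and a counter derived from that line's address, so each line decrypts independently with no neighboring data. A minimal sketch of that independence, using a SHA-512-based keystream purely as a stdlib stand-in for AES-256-CTR (Python's standard library has no AES; the counter-from-address construction is the point being illustrated, not the cipher):

```python
import hashlib

LINE = 64  # cache-line / AXI burst size in bytes

def keystream(key: bytes, line_addr: int) -> bytes:
    # Stand-in for AES-256-CTR: the counter block is derived from the
    # line's address, so the keystream for any 64-byte line can be
    # computed without touching any other line. (sha512 is used only
    # because the stdlib lacks AES; real hardware would run AES-256.)
    return hashlib.sha512(key + line_addr.to_bytes(8, "big")).digest()[:LINE]

def xor_line(data: bytes, ks: bytes) -> bytes:
    # CTR mode: encryption and decryption are the same XOR.
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"\x01" * 32
weights = bytes(range(256))  # 256 bytes of plaintext "weights" = 4 lines

# Encrypt line by line (done once, before deployment).
ct = b"".join(
    xor_line(weights[a:a + LINE], keystream(key, a))
    for a in range(0, len(weights), LINE)
)

# Decrypt only line 2 -- no other bytes are fetched or decrypted,
# unlike a 4 KB page-granular scheme.
addr = 2 * LINE
pt = xor_line(ct[addr:addr + LINE], keystream(key, addr))
assert pt == weights[addr:addr + LINE]
```

Because the counter is known before the ciphertext arrives, hardware can start generating the keystream the moment the address is issued, which is what lets Tessera overlap crypto with the DRAM fetch.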

Astrobobo tool mapping

  • Reading Queue: Add the Tessera paper to your queue, prioritizing Section 3 (architecture) and the bandwidth measurement results in Section 5.
  • Knowledge Capture: Document the key insight: crypto latency can be hidden if keystream generation is pipelined with the DRAM fetch. Capture the specific timing constraints (64-byte burst, AES-256-CTR, DRAM access time) that make this work.
  • Focus Brief: Summarize the threat model (OS compromise, physical DRAM extraction, rogue DMA) and how Tessera neutralizes each. Note the gaps (side channels, key management, sparse models).
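The latency-hiding argument can be captured in a two-line timing model: serially, per-line latency is fetch plus keystream plus XOR; pipelined, the keystream is generated during the fetch, so only the slower of the two appears on the critical path. The numbers below are illustrative assumptions, not measurements from the paper:

```python
# Toy per-line timing model (all values are assumed, in nanoseconds).
T_DRAM = 50.0  # fetch one 64-byte burst from DRAM
T_AES  = 40.0  # generate one 64-byte AES-256-CTR keystream block
T_XOR  = 1.0   # XOR the keystream with the arriving burst

serial    = T_DRAM + T_AES + T_XOR      # decrypt only after the fetch completes
pipelined = max(T_DRAM, T_AES) + T_XOR  # keystream computed during the fetch

overhead = pipelined / T_DRAM - 1.0
print(f"serial: {serial} ns, pipelined: {pipelined} ns, overhead: {overhead:.1%}")
```

Under these assumptions the pipelined path adds only the final XOR to the raw fetch time, which is the shape of result (single-digit-percent overhead) the paper reports; the constraint is that keystream generation must finish no later than the DRAM burst arrives.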

Frequently asked

  • Why does page-level encryption waste bandwidth? Page-level encryption operates at 4 KB granularity. When a neural network layer accesses a small tensor tile (e.g., 64 bytes), the system must fetch the entire 4 KB page, decrypt it, and extract the needed bytes. This forces unnecessary data movement and cache pollution. Tessera avoids this by decrypting at 64-byte cache-line granularity, matching the actual memory access size.
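As back-of-envelope arithmetic (not the paper's measurement), the data-movement amplification of page-granular decryption is simply page size divided by the tile size actually used:

```python
PAGE_BYTES = 4096  # page-level decryption granularity

# Bytes moved per useful byte for a few tile sizes. A 64-byte tile is the
# worst case (64x); the "up to 32x" figure in the summary is consistent
# with ~128-byte tiles, though that mapping is an inference on our part.
for tile in (64, 128, 1024):
    print(f"{tile}-byte tile: {PAGE_BYTES // tile}x data movement at 4 KB granularity")
```

Tessera fetches exactly the requested 64-byte lines, so its amplification stays at 1x regardless of tile size.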
Cite
APA
Animan Naskar. (2026, April 28). Tessera: Cache-Line Encryption for Edge AI Without Bandwidth Loss. Astrobobo Content Engine (rewrite of arxiv/cs.LG). https://astrobobo-content-engine.vercel.app/article/tessera-cache-line-encryption-for-edge-ai-without-bandwidth-loss-df6bf3
MLA
Animan Naskar. "Tessera: Cache-Line Encryption for Edge AI Without Bandwidth Loss." Astrobobo Content Engine, 28 Apr 2026, https://astrobobo-content-engine.vercel.app/article/tessera-cache-line-encryption-for-edge-ai-without-bandwidth-loss-df6bf3. Based on "arxiv/cs.LG", https://arxiv.org/abs/2604.23205.
BibTeX
@misc{astrobobo_tessera-cache-line-encryption-for-edge-ai-without-bandwidth-loss-df6bf3_2026,
  author       = {Animan Naskar},
  title        = {Tessera: Cache-Line Encryption for Edge AI Without Bandwidth Loss},
  year         = {2026},
  url          = {https://astrobobo-content-engine.vercel.app/article/tessera-cache-line-encryption-for-edge-ai-without-bandwidth-loss-df6bf3},
  note         = {Astrobobo rewrite of arxiv/cs.LG, https://arxiv.org/abs/2604.23205},
}
