Remove sample content from site

2026-03-18 18:16:01 +08:00
parent 4c8fabb8d4
commit 9d81547d46
7 changed files with 14 additions and 136 deletions
--- a/docs/papers/attention-is-all-you-need.md
+++ b/docs/papers/attention-is-all-you-need.md
@@ -1,63 +0,0 @@
----
-title: Attention Is All You Need
-authors: Ashish Vaswani et al.
-year: 2017
-venue: NeurIPS
-tags:
-  - transformer
-  - attention
-  - sequence-modeling
-status: published
----
-
-# Attention Is All You Need
-
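-> [Paper link](https://arxiv.org/abs/1706.03762)
-
-## One-Sentence Summary
-
-This paper introduces the Transformer, which replaces RNNs and CNNs with a purely attention-based mechanism and markedly improves both the parallelism and the performance of sequence modeling.
-
-## Research Question
-
-Traditional sequence models (RNN, LSTM) are hard to parallelize and are inefficient at modeling long-range dependencies. The authors set out to find a more efficient approach to sequence-to-sequence modeling.
-
-## Core Method
-
-The core of the Transformer consists of the following modules: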
-
-1. **Multi-Head Self-Attention**
-2. **Position-wise Feed-Forward Network**
-3. **Residual Connection + LayerNorm**
-4. **Positional Encoding**
-
-The core formula for the attention computation:
-
-$$
-\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V
-$$
-
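-## Key Innovations
-
-- Replaces recurrent structures with self-attention
-- Multi-head attention lets the model capture relationships in different subspaces
-- The encoder/decoder structure is highly parallelizable
-
-## Experimental Results
-
-On machine translation tasks, the Transformer achieved very strong results for its time, while training noticeably faster than recurrent models.
-
-## Strengths
-
-- Parallelization-friendly
-- Models long-range dependencies more directly
-- Clean architecture that is easy to extend
-
-## Limitations
-
-- Positional encoding is not inherent to the model
-- Attention complexity grows quadratically with sequence length
-
-## My Understanding / Takeaways
-
-The most important contribution of this paper is not just "better results": it shifted the backbone of sequence modeling from "recurrence" to "relation-based global interaction", which opened up the mainstream paradigm for later large language models.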
--- a/docs/papers/index.md
+++ b/docs/papers/index.md
@@ -5,7 +5,6 @@
 ## Published

 - [Efficient Security Support for CXL Memory through Adaptive Incremental Offloaded (Re-)Encryption](aiore-cxl-security.md)
-- [Attention Is All You Need](attention-is-all-you-need.md)

 ## Suggested Template