Hacker News
A polynomial autoencoder beats PCA on transformer embeddings
folderquestion
The anisotropy and cone ideas may explain why PCA underperforms, but they do not uniquely justify this particular quadratic decoder. The geometric story does no explanatory work beyond "the data is nonlinear"; the real substance is simply that second-order reconstruction helps empirically.
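To make "second-order reconstruction" concrete, here is a minimal sketch, not the article's actual method: it keeps a plain PCA encoder and fits a quadratic decoder by ordinary least squares. The synthetic data, the function names (`pca_fit`, `quadratic_features`, `demo`) and all sizes are assumptions chosen for illustration, and the errors are measured on the same data the decoder is fit on.

```python
import numpy as np

def pca_fit(X, k):
    """Return the data mean and the top-k principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def quadratic_features(Z):
    """Build [1, z_i, z_i * z_j for i <= j] for each row of Z."""
    n, k = Z.shape
    iu, ju = np.triu_indices(k)
    return np.hstack([np.ones((n, 1)), Z, Z[:, iu] * Z[:, ju]])

def demo(seed=0, n=2000, d=16, k=4):
    # Hypothetical synthetic data: a linear map of a latent code plus a
    # second-order term, standing in for "nonlinear" embeddings.
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, k))
    iu, ju = np.triu_indices(k)
    A = rng.normal(size=(k, d))
    B = rng.normal(size=(len(iu), d))
    X = latent @ A + 0.3 * (latent[:, iu] * latent[:, ju]) @ B

    # Shared linear encoder: project onto the top-k principal directions.
    mean, W = pca_fit(X, k)
    Z = (X - mean) @ W.T

    # Linear (PCA) decoder.
    X_lin = mean + Z @ W

    # Quadratic decoder: least-squares fit from second-order features of Z.
    Phi = quadratic_features(Z)
    coef, *_ = np.linalg.lstsq(Phi, X, rcond=None)
    X_quad = Phi @ coef

    mse_lin = float(np.mean((X - X_lin) ** 2))
    mse_quad = float(np.mean((X - X_quad) ** 2))
    return mse_lin, mse_quad

if __name__ == "__main__":
    mse_lin, mse_quad = demo()
    print(f"PCA decoder MSE:       {mse_lin:.4f}")
    print(f"quadratic decoder MSE: {mse_quad:.4f}")
```

Because the quadratic feature set contains the constant and linear terms, the fitted decoder can never do worse than the PCA decoder on the data it was fit on. Whether the gap holds on held-out embeddings is the empirical question the comment is pointing at.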
mentalgear
By representing data as multivectors, these models encode translational and rotational symmetries natively, which lets them handle geometric hierarchies with large efficiency gains (reports of up to 78x speedups and 200x parameter reductions) compared to standard Transformers.
> A novel sequence architecture is introduced, Versor, which uses Conformal Geometric Algebra (CGA) in place of traditional linear operations to achieve structural generalization and significant performance improvements on a variety of tasks, while offering improved interpretability and efficiency. By embedding states in the manifold and evolving them via geometric transformations (rotors), Versor natively represents -equivariant relationships without requiring explicit structural encoding. Versor is validated on chaotic N-body dynamics, topological reasoning, and standard multimodal benchmarks (CIFAR-10, WikiText-103), consistently outperforming Transformers, Graph Networks, and geometric baselines (GATr, EGNN).
yobbo
pleshkov
Devilstro
yorwba
electroglyph
magicalhippo
But for RAG that might be too much work per vector?
teiferer
whywhywhywhy
None of this stuff is as difficult to understand as people claim it is once you work with it.
magicalhippo
That said, I do think it's a good habit to either write out abbreviations in full or link to, say, Wikipedia, e.g. for PCA[1]. It's a well-known tool, but if you come from a slightly different field it might not ring a bell.
[1]: https://en.wikipedia.org/wiki/Principal_component_analysis