Tensor Considered Harmful

Tensor Considered Harmful, by Alexander Rush

TL;DR: Despite its ubiquity in deep learning, Tensor is broken. It forces bad habits such as exposing private dimensions, broadcasting based on absolute position, and keeping type information in documentation. This post presents a proof-of-concept of an alternative approach, named tensors, with named dimensions. This change eliminates the need for indexing, dim arguments, einsum- style unpacking, and documentation-based coding. The prototype PyTorch library accompanying this blog post is available as namedtensor.

Thanks to Edward Z. Yang for pointing me to this "Considered Harmful" position paper.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

dimensions and units

My impression of this business of named tensors is that it's another form of the units of measure feature that comes up from time to time in PL design, either in statically or dynamically typed form. Though that in itself may say something about the nature of thought versus computation.

Truthfully, I don't grok tensors. I consider it a major gap in my mathematical toolkit. From what I've heard, I'm not alone in not grokking them, either. I've been stuck partway through Penrose's Road to Reality for several years, now, because I decided his overview of tensors just wasn't enough for me to move on past, and I was going to need to go off and get a feel for them elsewhere and come back (by now, I s'pose I'd have to start over from the beginning). So far, for all the explanations of tensors I've read, I haven't got a handle on them. Though I've got a new appreciation of the difference between reading a textbook and taking a well-taught class based on a textbook.

Synchronicity w/ The Morning Paper

Interesting that The Morning Paper picked a paper that highlights a similar issue:

Machine Learning Systems are stuck in a rut

For example:

Named dimensions improve readability by making it easier to determine how dimensions in the code correspond to the semantic dimensions described in, .e.g., a research paper.