Using a PyTorch-core codegen API

Below is a proposal for taking the codegen that pytorch/xla currently performs in-tree, and shifting that into pytorch core as a public codegen API that pytorch/xla can call. Sharing the design here for visibility! cc @ezyang @ailzhang 


**Background**

The following two resources are commonly used when implementing operators for external backends:

* https://pytorch.org/tutorials/advanced/extend_dispatcher.html
* pytorch/xla's codegen, which people often copy and paste to get started registering their kernels. What does it do?
    * Generates registrations for all operators in the codebase
    * Generates CPU fallback registrations for operators you don’t have implemented
        * automatically performs xla→cpu→xla conversions
    * Catches schema changes to pytorch operators
    * In some cases, generates implementations of out operators in terms of their functional variants
    * Generates backend-specific autograd registrations for one or two kernels

Currently, you tell the XLA codegen which functions you’ve implemented by adding their declarations to the header `torch_xla/csrc/aten_xla_type.h`:

```
// In aten_xla_type.h
  static at::Tensor acos(const at::Tensor& self);
  static at::Tensor& acos_(at::Tensor& self);
  static at::Tensor add(const at::Tensor& self, const at::Tensor& other, const at::Scalar& alpha);
  static at::Tensor add(const at::Tensor& self, const at::Scalar& other, const at::Scalar& alpha);
```



**Goals**

* XLA’s codegen is making up for perceived deficiencies in the APIs we provide: it takes a lot of boilerplate to write all of the registrations a backend needs (and some of them, like CPU fallbacks, can be programmatically generated). We want to close these gaps and prevent people from having to copy and paste XLA’s codegen to get these facilities.
* pytorch/xla has its own parser for in-tree files like RegistrationDeclarations.yaml. PyTorch core has its own tools for this, and it would be great not to have to duplicate them.



**The Pitch**

We will offer code generation in PyTorch itself, which you can use to generate this boilerplate.

First, you need to specify to the system which operators you actually support. This is specified as a list in a YAML file, say, xla_native_functions.yaml

```
backend: XLA
cpp_namespace: torch_xla
supported:  # can omit inplace/out if functional is supported
  - acos
  - add.Tensor
  - copy_  # no functional, inplace is implemented directly
  ...
autograd:
  - max_pool2d # override autograd instead of the forward
```

Then, as part of your build system, you run a codegen script from PyTorch on your YAML file: `pytorch/tools/codegen/gen_backend_stubs.py xla_native_functions.yaml --output-dir /path/to/codegen/output/dir`

This will generate boilerplate for you. Here are the files you will get:

```
// ------------------------------------------------
// XLANativeFunctions.h - stubs of operations you should implement

namespace torch_xla {

Tensor acos(const Tensor & self);
Tensor add(const Tensor & self, const Tensor & other, const Scalar & alpha=1);
...

} // namespace torch_xla

// ------------------------------------------------
// RegisterXLA.cpp

Tensor wrapper_add_Tensor(
  const Tensor & self, const Tensor & other, const Scalar & alpha
) {
  return torch_xla::add(self, other, alpha);
}

// inplace and out variants are automatically generated, call into the functional variant

Tensor & wrapper_add__Tensor(
  Tensor & self, const Tensor & other, const Scalar & alpha
) {
  return torch_xla::copy_(self, torch_xla::add(self, other, alpha));
}

Tensor & wrapper_add_out(
  const Tensor & self, const Tensor & other, const Scalar & alpha, Tensor & out
) {
  return torch_xla::copy_(out, torch_xla::add(self, other, alpha));
}

TORCH_LIBRARY_IMPL(aten, XLA, m) {
  m.impl("add.Tensor", TORCH_FN(wrapper_add_Tensor));
  m.impl("add_.Tensor", TORCH_FN(wrapper_add__Tensor));
  m.impl("add.out", TORCH_FN(wrapper_add_out));
  ...
}
```

Notable benefits:

* You don’t have to figure out the correct C++ signatures; we generate them for you
* You don’t have to write inplace/out versions of functions; they get generated for you
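
To make the YAML-to-registration mapping a bit more concrete, here is a minimal Python sketch of how a generator could expand the `supported` list into the `m.impl(...)` lines shown above. This is not the actual `gen_backend_stubs.py`: the `VARIANTS` table, `wrapper_name`, `registration_lines`, and `emit_registrations` are all hypothetical names, and the table is a hand-written stand-in for information a real codegen would read from PyTorch's operator schemas.

```python
# Hypothetical stand-in for schema information: functional op -> the
# inplace/out variants that could be generated from it. A real codegen would
# look these up in PyTorch's operator definitions rather than hard-code them.
VARIANTS = {
    "acos": ["acos_", "acos.out"],
    "add.Tensor": ["add_.Tensor", "add.out"],
}


def wrapper_name(op):
    """Map an operator name to a wrapper name, e.g. 'add_.Tensor' -> 'wrapper_add__Tensor'."""
    return "wrapper_" + op.replace(".", "_")


def registration_lines(supported):
    """Return one m.impl(...) line per supported op and per derived variant."""
    ops = []
    for op in supported:
        # Register the op itself, then any variants we know how to derive.
        for candidate in [op] + VARIANTS.get(op, []):
            if candidate not in ops:
                ops.append(candidate)
    return ['m.impl("%s", TORCH_FN(%s));' % (op, wrapper_name(op)) for op in ops]


def emit_registrations(backend, supported):
    """Wrap the registration lines in a TORCH_LIBRARY_IMPL block for `backend`."""
    body = "\n".join("  " + line for line in registration_lines(supported))
    return "TORCH_LIBRARY_IMPL(aten, %s, m) {\n%s\n}" % (backend, body)


# Same shape as the YAML example, written as a dict to avoid a YAML dependency.
spec = {"backend": "XLA", "supported": ["acos", "add.Tensor", "copy_"]}
print(emit_registrations(spec["backend"], spec["supported"]))
```

Note that `copy_` has no entry in the table, so it is registered directly and nothing is derived from it, matching the "no functional, inplace is implemented directly" comment in the YAML.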

**How we’re getting there**

1. A nearly byte-for-byte compatible rewrite of the codegen that will live in PyTorch. None of the fancy new features yet; those will come later.
    * Included in this rewrite: a YAML file that subsumes `aten_xla_type.h`
2. Start refactoring to take advantage of the new features.


**Appendix: Backwards compatibility**

Right now, the XLA codegen has logic to catch and error out when it sees BC-breaking schema changes to in-tree ops. As part of this change, BC-breaking schema changes that require fixups to external backend kernels will be caught by the compiler/linker, instead of by the codegen script. This is mostly because we’re codegen’ing the headers for each kernel for you, rather than having the backends write out the schema for the headers of each op themselves.
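
As a rough illustration of why a stale kernel would surface at build time, here is a toy Python model of the link step. It is only a sketch: signatures are plain strings (a real toolchain compares mangled symbols), `unresolved_declarations` is a hypothetical helper, and the schema change shown is made up for the example.

```python
def unresolved_declarations(generated_header, backend_definitions):
    """Return generated declarations that have no exactly matching definition.

    This mimics an "undefined reference" error: the generated wrapper calls
    the declared signature, so a definition with any other signature does
    not satisfy it.
    """
    defined = set(backend_definitions)
    return [decl for decl in generated_header if decl not in defined]


# Hypothetical schema change: `alpha` goes from by-value to const reference.
old_sig = "Tensor add(const Tensor&, const Tensor&, Scalar)"
new_sig = "Tensor add(const Tensor&, const Tensor&, const Scalar&)"

backend_definitions = [old_sig]  # kernel written against the old schema
header_before = [old_sig]        # header generated before the schema change
header_after = [new_sig]         # header regenerated after the schema change

print(unresolved_declarations(header_before, backend_definitions))  # nothing missing
print(unresolved_declarations(header_after, backend_definitions))   # the new signature
```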


**Appendix: Fallbacks**

Fallbacks to CPU are needed for ops that external backends haven’t implemented yet. Some backend kernels also aren’t implemented for all valid inputs, and need to conditionally call into a CPU fallback.

Fallbacks to CPU are currently implemented in codegen. They will eventually be handled for you via generic implementations that would be provided by PyTorch:

```
// DispatchKey::FallbackCPU (maybe)
Tensor fallback_add(
  const Tensor & self, const Tensor & other, const Scalar & alpha
) {
  auto args = at::list_to_cpu({self, other}); // external backends override this
  auto result_cpu = at::add(args[0], args[1], alpha);
  return result_cpu.to(self.device());
}
```
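
The device → CPU → device round trip in the snippet above can also be modelled without PyTorch. The following is a minimal Python sketch under stated assumptions: `FakeTensor`, `cpu_add`, and `make_cpu_fallback` are hypothetical names for illustration, and a tuple of numbers plus a device tag stands in for a real tensor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTensor:
    """Toy stand-in for a tensor: a tuple of values plus a device tag."""
    values: tuple
    device: str

    def to(self, device):
        # Pretend to copy the data to another device.
        return FakeTensor(self.values, device)


def cpu_add(a, b, alpha=1):
    """Toy CPU kernel: refuses inputs that are not on the CPU."""
    if a.device != "cpu" or b.device != "cpu":
        raise ValueError("cpu_add expects CPU inputs")
    return FakeTensor(tuple(x + alpha * y for x, y in zip(a.values, b.values)), "cpu")


def make_cpu_fallback(cpu_kernel):
    """Wrap a CPU kernel so that it accepts tensors from any device."""
    def fallback(*args, **kwargs):
        # Remember where the first tensor argument lives.
        target = next(a.device for a in args if isinstance(a, FakeTensor))
        # Move every tensor argument to the CPU; leave scalars untouched.
        cpu_args = [a.to("cpu") if isinstance(a, FakeTensor) else a for a in args]
        # Run on the CPU, then move the result back to the original device.
        return cpu_kernel(*cpu_args, **kwargs).to(target)
    return fallback


fallback_add = make_cpu_fallback(cpu_add)
out = fallback_add(FakeTensor((1, 2), "xla"), FakeTensor((10, 20), "xla"), alpha=2)
print(out)
```

Writing the fallback as a generic wrapper rather than once per operator is the point of the "generic implementations" idea: the per-op boilerplate disappears and only the device conversion remains backend-specific.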


**Current Status**

A WIP version of the XLA-side change can be found here: https://github.com/pytorch/xla/pull/2869
