Optimizes a given model by replacing its forward method with a call to optimized code. This is done in two steps:
- first, the given model is converted to an FX graph.
- second, patterns found in the graph are replaced with fast kernels.
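
The two steps above can be illustrated with plain `torch.fx`. This is only a minimal sketch, not the actual implementation of `optimize_model`: `TinyModel`, `pattern`, `replacement`, and `optimize_sketch` are hypothetical names introduced here, and `torch.clamp` stands in for a real fast kernel.

```python
import torch
import torch.fx
from torch.fx import subgraph_rewriter


class TinyModel(torch.nn.Module):
    """Hypothetical toy model used only for this illustration."""

    def forward(self, x, y):
        return torch.relu(x + y)


def pattern(x, y):
    # Subgraph to look for in the traced model.
    return torch.relu(x + y)


def replacement(x, y):
    # Stand-in for a fast kernel (assumption: a real optimizer would
    # call a hand-written fused kernel here instead).
    return torch.clamp(x + y, min=0)


def optimize_sketch(model):
    # Step 1: convert the model to an FX graph.
    graph_module = torch.fx.symbolic_trace(model)
    # Step 2: replace every occurrence of the pattern in the graph.
    matches = subgraph_rewriter.replace_pattern(graph_module, pattern, replacement)
    # Replace the forward method by a call to the rewritten graph.
    model.forward = graph_module.forward
    return len(matches)


model = TinyModel().eval()
num_replaced = optimize_sketch(model)
output = model(torch.randn(4), torch.randn(4))
```

The model is modified in place, mirroring the way `optimize_model(model)` is called without using a return value; the match count returned here is only a convenience for this sketch.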

Example:
    from transformers import AutoModel

    model = AutoModel.from_pretrained(...).eval().cuda()
    optimize_model(model)
    inputs = ...
    model(**inputs)

Args:
    model: model to optimize