We present a new view of the output activations of a block of layers in deep neural networks. In particular, we view the output activation of a linear operator, convolutional or fully connected, followed by a non-linearity and then another linear operator, as an approximate solution to a certain convex optimization problem. We show that replacing such blocks with optimization layers acting as solvers improves performance in different settings.
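To make the correspondence concrete, the sketch below assumes the standard interpretation of ReLU as the Euclidean projection onto the nonnegative orthant, i.e. the exact minimizer of a simple convex problem; the paper's actual convex formulation may differ. The hypothetical `optimization_block` replaces the closed-form activation with a projected-gradient solver for that problem, illustrating how a linear–nonlinearity–linear block can be read as an approximate solution to a convex program.

```python
import numpy as np

def relu_block(x, W1, b1, W2, b2):
    # Standard block: linear operator -> ReLU -> linear operator.
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def optimization_block(x, W1, b1, W2, b2, steps=100, lr=0.5):
    # Hypothetical optimization-layer variant (illustrative assumption):
    # the inner activation z is obtained by numerically solving the convex problem
    #     min_z 0.5 * ||z - (W1 x + b1)||^2   s.t.  z >= 0,
    # whose exact solution is ReLU(W1 x + b1).  Projected gradient descent
    # stands in here for a generic convex solver.
    u = W1 @ x + b1
    z = np.zeros_like(u)
    for _ in range(steps):
        z = z - lr * (z - u)      # gradient step on the quadratic objective
        z = np.maximum(z, 0.0)    # projection onto the nonnegative orthant
    return W2 @ z + b2

# Usage example: the solver-based block matches the standard block numerically.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((4, 16)), rng.standard_normal(4)

print(np.allclose(relu_block(x, W1, b1, W2, b2),
                  optimization_block(x, W1, b1, W2, b2), atol=1e-6))
```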