Maxout

Maxout activation function.

Select the maximum across multiple linear functions, allowing the network to learn piecewise linear convex functions.

activations_plus.Maxout.__init__(self, num_pieces: int) -> None

Initialize the Maxout activation module.

Store the number of pieces into which the last dimension of the input is split for the Maxout operation.

Parameters:

num_pieces – Number of pieces into which the input is divided for the Maxout operation.

activations_plus.Maxout.forward(self, x: Tensor) -> Tensor

Reshape the input tensor and compute the maximum along the split dimension.

Reshape the input tensor so that the last dimension is divided into num_pieces, then compute and return the maximum values along the new axis.

Parameters:

x – A tensor of arbitrary shape where the last dimension must be divisible by self.num_pieces.

Returns:

A tensor containing the maximum values along the split dimension of the reshaped input tensor. Every dimension except the last matches the input tensor; the last dimension shrinks because the maximum collapses the pieces.
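The reshape-and-max step described for forward can be sketched in a few lines. This is a minimal illustration, not the library's source: maxout_sketch is a hypothetical helper, and the grouping order (consecutive runs of num_pieces values along the last dimension form one group) is an assumption that the documentation does not confirm.

```python
import torch


def maxout_sketch(x: torch.Tensor, num_pieces: int) -> torch.Tensor:
    """Hypothetical stand-in for Maxout.forward; the grouping order is assumed."""
    last = x.shape[-1]
    if last % num_pieces != 0:
        raise ValueError(
            f"last dimension {last} is not divisible by num_pieces={num_pieces}"
        )
    # Split the last dimension into groups of num_pieces consecutive values.
    grouped = x.reshape(*x.shape[:-1], last // num_pieces, num_pieces)
    # Keep the largest value in each group; the grouping axis disappears.
    return grouped.max(dim=-1).values


x = torch.tensor([[1.0, 5.0, 2.0, 3.0],
                  [4.0, 0.0, 7.0, 6.0]])
y = maxout_sketch(x, num_pieces=2)
print(y.shape)  # torch.Size([2, 2])
print(y)        # values: [[5., 3.], [4., 7.]]
```

Under this assumed grouping, the leading dimension is unchanged and the last dimension goes from 4 to 4 // 2 = 2. A last dimension that is not divisible by num_pieces cannot be reshaped this way, which is why the sketch raises an error early.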

Reference Paper: Maxout Activation Function

Mathematical Explanation:

The Maxout activation function is defined as:

\[f(x) = \max_{i \in [1, k]} (x \cdot W_i + b_i)\]

where \(W_i\) and \(b_i\) are learnable parameters, and \(k\) is the number of linear pieces (num_pieces in this module).
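The constructor documented above takes only num_pieces, so one plausible reading is that the affine pieces \(x \cdot W_i + b_i\) come from a preceding linear layer and the activation only takes the maximum over them. The sketch below illustrates that reading under stated assumptions: the layer sizes, the use of nn.Linear, and the grouping of consecutive values are all choices made for illustration, not confirmed behavior of the library.

```python
import torch
from torch import nn

torch.manual_seed(0)

in_features, out_features, k = 4, 3, 2  # illustrative sizes; k plays the role of num_pieces

# One linear layer computes all k affine pieces for every output unit at once.
linear = nn.Linear(in_features, out_features * k)

x = torch.randn(5, in_features)
z = linear(x)  # shape (5, out_features * k)

# Assumed grouping: each run of k consecutive values belongs to one output unit.
y = z.reshape(5, out_features, k).max(dim=-1).values  # shape (5, out_features)

# Cross-check against the formula: explicit max over the k affine functions.
W = linear.weight.reshape(out_features, k, in_features)
b = linear.bias.reshape(out_features, k)
pieces = torch.einsum("ni,opi->nop", x, W) + b  # (5, out_features, k)
expected = pieces.max(dim=-1).values

print(torch.allclose(y, expected, atol=1e-5))
```

Folding the k pieces into a single wider linear layer keeps the parameters \(W_i\) and \(b_i\) in an ordinary trainable module, while the parameter-free maximum selects the active piece for each output unit.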

Example Usage:

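import torch
from activations_plus.maxout import Maxout

# Take the maximum over 2 pieces of the last dimension.
activation = Maxout(num_pieces=2)
# The last dimension (size 2) is divisible by num_pieces, as forward() requires.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = activation(x)
print("Maxout Output:", y)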