Not OP. This question is being reposted to preserve technical content removed from elsewhere. Feel free to add your own answers/discussion.
I have two 2-D arrays with the same first-axis dimension. In Python, I would like to convolve the two matrices along the second axis only. I would like to get C below without computing the convolution along the first axis as well.
import numpy as np
import scipy.signal as sg
M, N, P = 4, 10, 20
A = np.random.randn(M, N)
B = np.random.randn(M, P)
C = sg.convolve(A, B, 'full')[(2*M-1)//2]  # integer division: a plain / yields a float, which is not a valid index in Python 3
Is there a fast way?
Original answer (credit: @Mercury):
Late answer, but worth posting for reference.
Naive / Straightforward Approach
import numpy as np
import scipy.signal as sg

M, N, P = 4, 10, 20
A = np.random.randn(M, N)  # (4, 10)
B = np.random.randn(M, P)  # (4, 20)

C = np.vstack([sg.convolve(a, b, 'full') for a, b in zip(A, B)])

>>> C.shape
(4, 29)
Each row in A is convolved with the respective row in B; essentially, this performs M independent 1-D convolutions in a Python loop.
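If the loop overhead matters, a loop-free NumPy variant (a sketch, not part of the original answer) computes the same 'full' convolution along axis 1 via the convolution theorem, assuming real-valued inputs. Each row is zero-padded to the full output length, so the circular convolution implied by the FFT equals the linear one:

L = N + P - 1                     # length of a 'full' convolution
FA = np.fft.rfft(A, n=L, axis=1)  # zero-pads each row to length L
FB = np.fft.rfft(B, n=L, axis=1)
C_fft = np.fft.irfft(FA * FB, n=L, axis=1)

>>> np.allclose(C, C_fft)
True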
No Loop + CUDA-Supported Version

It is possible to replicate this operation by using PyTorch’s F.conv1d. We have to imagine A as a 4-channel, 1-D signal of length 10. We wish to convolve each channel in A with a specific kernel of length 20. This is a special case called a depthwise convolution, often used in deep learning.
Note that torch’s conv is implemented as cross-correlation, so we need to flip B in advance to do actual convolution.
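A quick NumPy check of that identity (illustrative only; real-valued 1-D signals assumed): cross-correlating with a flipped kernel reproduces convolution.

a = np.random.randn(5)
b = np.random.randn(3)

>>> np.allclose(np.convolve(a, b, 'full'), np.correlate(a, b[::-1], 'full'))
True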
import torch
import torch.nn.functional as F

@torch.no_grad()
def torch_conv(A, B):
    M, N, P = A.shape[0], A.shape[1], B.shape[1]
    # Output length is N + 2*padding - P + 1, so padding = P - 1 yields the
    # 'full' convolution length N + P - 1 for any N and P.
    C = F.conv1d(A, B[:, None, :], bias=None, stride=1, groups=M, padding=P - 1)
    return C.numpy()

# Convert A and B to torch tensors + flip B
X = torch.from_numpy(A)                    # (4, 10)
W = torch.from_numpy(np.fliplr(B).copy())  # (4, 20)

# Do grouped conv and get np array
Y = torch_conv(X, W)

>>> Y.shape
(4, 29)
>>> np.allclose(C, Y)
True
Advantages of using a depthwise convolution with torch:
- No loops!
- The above solution can also run on CUDA/GPU, which can really speed things up if A and B are very large matrices. (From the OP’s comments, this seems to be the case: A is 10 GB in size.) See the sketch after this list.

Disadvantages:

- Overhead of converting from array to tensor (should be negligible).
- Need to flip B once before the operation.
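For the CUDA point above, a minimal sketch of the GPU variant, reusing A, B, M, and P from earlier (the device-selection line is an assumption of mine, not part of the original answer; it requires a CUDA-enabled PyTorch build to actually hit the GPU):

device = "cuda" if torch.cuda.is_available() else "cpu"

X = torch.from_numpy(A).to(device)                    # (4, 10)
W = torch.from_numpy(np.fliplr(B).copy()).to(device)  # (4, 20)

with torch.no_grad():
    # Same grouped (depthwise) conv as torch_conv above, now on the device
    Y = F.conv1d(X, W[:, None, :], bias=None, stride=1, groups=M, padding=P - 1)

Y = Y.cpu().numpy()  # move the result back to host memory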