DualPipe is a bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report. It fully overlaps computation and communication across the forward and backward phases, which also reduces pipeline bubbles. To use it on a real model, you need to implement a custom overlapped forward-backward method for your specific modules.
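A rough sketch of what "reduces pipeline bubbles" means in numbers, assuming the bubble formulas from the comparison table in the report: (PP-1)(F+B) for 1F1B and (PP/2-1)(F&B+B-3W) for DualPipe. The function names and the timings below are hypothetical and only illustrate the arithmetic; they are not part of the DualPipe repo.

```python
def bubble_1f1b(pp: int, f: float, b: float) -> float:
    """Idle time per rank for a standard 1F1B schedule: (PP - 1) * (F + B)."""
    return (pp - 1) * (f + b)


def bubble_dualpipe(pp: int, f_and_b: float, b: float, w: float) -> float:
    """Idle time per rank for DualPipe: (PP/2 - 1) * (F&B + B - 3W).

    pp      -- number of pipeline stages (assumed even here)
    f_and_b -- time of one mutually overlapped forward+backward chunk pair
    b       -- time of one full backward chunk
    w       -- time of the "backward for weights" part of a chunk
    """
    if pp % 2 != 0:
        raise ValueError("this sketch assumes an even number of pipeline stages")
    return (pp // 2 - 1) * (f_and_b + b - 3 * w)


# Made-up timings (arbitrary units), chosen only to show the calculation.
PP = 8
F, B, W = 1.0, 2.0, 1.0
F_AND_B = F + B  # pessimistic assumption: overlapping saves no time at all

print("1F1B bubble:    ", bubble_1f1b(PP, F, B))
print("DualPipe bubble:", bubble_dualpipe(PP, F_AND_B, B, W))
```

Even with the pessimistic assumption that an overlapped pair takes as long as F + B, the DualPipe estimate comes out well below the 1F1B one for these toy numbers. Per the report, the trade-off is that each rank holds two copies of the model parameters.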
https://github.com/deepseek-ai/DualPipe
#machinelearning #parallelism #algorithm #pytorch #deepseek