Abstract: The Transformer has achieved impressive performance in the multi-channel speech enhancement field; however, it struggles to capture local features, which leads to the loss of speech details.
Abstract: Fine-tuning has become the norm for achieving state-of-the-art performance with pre-trained networks such as foundation models. These models are typically pre-trained on large-scale ...