At the intersection of classical signal processing and machine learning, AykutKocLab has made a significant contribution by introducing FrFNet, an efficient transformer encoder that uses the fractional Fourier transform (FrFT) to mix the transformer's tokens. FrFNet is described in the paper “Fractional Fourier Transform Meets Transformer Encoder,” published in IEEE Signal Processing Letters.
A core mechanism of neural networks is the flow of information across layers and the vast number of interconnections formed during that flow. To this end, FrFNet establishes cross-layer connections capable of realizing infinitely many token-mixing transformations, since the FrFT's fractional order is a continuous parameter, bringing this powerful tool from classical signal processing into machine learning. FrFNet increases model accuracy without incurring additional computational cost.
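To make the token-mixing idea concrete, below is a minimal NumPy sketch of one standard discretization of the FrFT, the Dickinson-Steiglitz/Candan eigenvector construction; the paper's exact discretization may differ, and the function name `dfrft_matrix` is ours.

```python
import numpy as np

def dfrft_matrix(N, a):
    """N x N discrete fractional Fourier transform matrix of order a
    (a=0 gives the identity; a=1 approximates the ordinary DFT)."""
    # Nearly tridiagonal Dickinson-Steiglitz matrix, which commutes with
    # the DFT; its eigenvectors approximate discrete Hermite-Gaussians.
    n = np.arange(N)
    S = np.zeros((N, N))
    S[n, n] = 2.0 * np.cos(2.0 * np.pi * n / N)
    S[n, (n + 1) % N] = 1.0
    S[(n + 1) % N, n] = 1.0
    _, V = np.linalg.eigh(S)       # eigenvalues in ascending order
    V = V[:, ::-1]                 # descending ~ ascending Hermite index k
    # Fractional power of the DFT: raise its eigenvalues (-i)^k to order a.
    # (The index-assignment subtlety for even N is glossed over here.)
    k = np.arange(N)
    return (V * np.exp(-1j * np.pi * 0.5 * a * k)) @ V.T

# Sanity checks: the matrix is unitary, and orders add: F^a F^b = F^(a+b).
F_half = dfrft_matrix(9, 0.5)
assert np.allclose(F_half @ F_half.conj().T, np.eye(9))
assert np.allclose(F_half @ F_half, dfrft_matrix(9, 1.0))
```

Because `a` varies continuously, every value selects a different unitary mixing matrix, which is the sense in which the FrFT offers infinitely many transformation domains between time (a=0) and frequency (a=1).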
We believe that introducing the FrFT to transformer encoders will stimulate further developments where signal processing meets deep neural networks.
The paper can be accessed [here].
Abstract:
Utilizing signal processing tools in deep learning models has been drawing increasing attention. The Fourier transform (FT), one of the most popular signal processing tools, is employed in many deep learning models. Transformer-based sequential input processing models have also started to make use of the FT. The existing FNet model shows that replacing the computationally expensive attention layer with the FT accelerates model training without significantly sacrificing task performance. We further improve this idea by introducing the fractional Fourier transform (FrFT) into the transformer architecture. As a transform parameterized by a fractional order, the FrFT provides access to any intermediate domain between time and frequency, where better-performing transformation domains can be found. Depending on the needs of the downstream task, a suitable fractional order can be used in our proposed model, FrFNet. Our experiments on downstream tasks show that FrFNet leads to performance improvements over the ordinary FNet.
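For intuition, here is a sketch of how such a fractional transform could slot into FNet's mixing recipe (transform along the hidden dimension, then the sequence dimension, and keep the real part), reusing `dfrft_matrix` from the sketch above; the name `frfnet_mix` and the default orders are illustrative, not the paper's exact layer design.

```python
import numpy as np

def frfnet_mix(x, a_seq=0.5, a_hidden=0.5):
    """FNet-style token mixing with the DFrFT in place of the FFT.
    x: (seq_len, hidden) array of token embeddings; a_seq and a_hidden
    are the fractional orders (a=1 recovers Fourier mixing up to scaling).
    Uses dfrft_matrix() from the sketch above."""
    F_seq = dfrft_matrix(x.shape[0], a_seq)     # mixes across tokens
    F_hid = dfrft_matrix(x.shape[1], a_hidden)  # mixes across features
    # Transform the hidden axis, then the sequence axis, and keep the
    # real part, mirroring FNet's Re(FFT_seq(FFT_hidden(x))).
    return np.real(F_seq @ (x @ F_hid.T))

# Example: mix 8 tokens with 16-dimensional embeddings.
y = frfnet_mix(np.random.randn(8, 16))
print(y.shape)  # (8, 16)
```

Like FNet's FFT, the mixing matrices here are fixed rather than learned, so swapping the attention layer for this operation adds no trainable parameters; the fractional orders can be chosen per downstream task.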