In this paper, we introduce the Multimodal Multitask Dual Attention Network (MMA-Net), a neural architecture designed for joint multimodal and multitask learning. MMA-Net processes audio and visual inputs through dedicated encoders, AudioNet and VideoNet, to capture the distinct yet complementary information carried by each modality. A key component of the model is Bidirectional Cross-Modality Attention, which refines each modality's features by exploiting their interdependencies, producing a stronger joint representation. This is complemented by the MultiTask Exchange Block, a mechanism that dynamically integrates and refines task-specific features, promoting information exchange and synergy across tasks. Experimental results show that MMA-Net achieves significant improvements over existing methods and sets a new benchmark in Cognitive Load Assessment.
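To make the cross-modal interaction concrete, the following is a minimal sketch of a bidirectional cross-modality attention block, assuming standard multi-head cross-attention between the audio and visual feature streams; the module name, dimensions, and residual/normalization choices are illustrative assumptions, not the paper's exact design.

```python
# Illustrative sketch only: shapes, dimensions, and layer choices are assumptions.
import torch
import torch.nn as nn


class BidirectionalCrossModalAttention(nn.Module):
    """Each modality attends over the other, then is refined with the
    cross-attended context via a residual connection and LayerNorm."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Audio queries attend over visual keys/values.
        self.audio_to_video = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Visual queries attend over audio keys/values.
        self.video_to_audio = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_audio = nn.LayerNorm(dim)
        self.norm_video = nn.LayerNorm(dim)

    def forward(self, audio: torch.Tensor, video: torch.Tensor):
        # audio: (batch, T_a, dim) features from an audio encoder (e.g., "AudioNet")
        # video: (batch, T_v, dim) features from a visual encoder (e.g., "VideoNet")
        audio_ctx, _ = self.audio_to_video(query=audio, key=video, value=video)
        video_ctx, _ = self.video_to_audio(query=video, key=audio, value=audio)
        # Residual connection + normalization refines each stream with the other's context.
        audio_out = self.norm_audio(audio + audio_ctx)
        video_out = self.norm_video(video + video_ctx)
        return audio_out, video_out


if __name__ == "__main__":
    block = BidirectionalCrossModalAttention(dim=256, num_heads=4)
    audio_feats = torch.randn(2, 50, 256)  # e.g., 50 audio frames
    video_feats = torch.randn(2, 30, 256)  # e.g., 30 video frames
    a, v = block(audio_feats, video_feats)
    print(a.shape, v.shape)  # torch.Size([2, 50, 256]) torch.Size([2, 30, 256])
```

In this sketch, each modality keeps its own temporal length while absorbing context from the other, which is one common way to realize bidirectional cross-modal refinement of the kind described above.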