Full Description
The ten-volume set LNCS 15016-15025 constitutes the refereed proceedings of the 33rd International Conference on Artificial Neural Networks and Machine Learning, ICANN 2024, held in Lugano, Switzerland, during September 17-20, 2024.
The 294 full papers and 16 short papers included in these proceedings were carefully reviewed and selected from 764 submissions. The papers cover the following topics:
Part I - theory of neural networks and machine learning; novel methods in machine learning; novel neural architectures; neural architecture search; self-organization; neural processes; novel architectures for computer vision; and fairness in machine learning.
Part II - computer vision: classification; computer vision: object detection; computer vision: security and adversarial attacks; computer vision: image enhancement; and computer vision: 3D methods.
Part III - computer vision: anomaly detection; computer vision: segmentation; computer vision: pose estimation and tracking; computer vision: video processing; computer vision: generative methods; and topics in computer vision.
Part IV - brain-inspired computing; cognitive and computational neuroscience; explainable artificial intelligence; robotics; and reinforcement learning.
Part V - graph neural networks; and large language models.
Part VI - multimodality; federated learning; and time series processing.
Part VII - speech processing; natural language processing; and language modeling.
Part VIII - biosignal processing in medicine and physiology; and medical image processing.
Part IX - human-computer interfaces; recommender systems; environment and climate; city planning; machine learning in engineering and industry; applications in finance; artificial intelligence in education; social network analysis; artificial intelligence and music; and software security.
Part X - workshop: AI in drug discovery; workshop: reservoir computing; special session: accuracy, stability, and robustness in deep neural networks; special session: neurorobotics; and special session: spiking neural networks.
Contents
.- Multimodality.
.- ARIF: An Adaptive Attention-Based Cross-Modal Representation Integration Framework.
.- BVRCC: Bootstrapping Video Retrieval via Cross-matching Correction.
.- CAW: Confidence-based Adaptive Weighted Model for Multi-modal Entity Linking.
.- Cross-Modal Attention Alignment Network with Auxiliary Text Description for zero-shot sketch-based image retrieval.
.- Exploring Interpretable Semantic Alignment for Multimodal Machine Translation.
.- Modal fusion-Enhanced two-stream hashing network for Cross modal Retrieval.
.- Text Visual Question Answering Based on Interactive Learning and Relationship Modeling.
.- Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment.
.- Federated Learning.
.- Addressing the Privacy and Complexity of Urban Traffic Flow Prediction with Federated Learning and Spatiotemporal Graph Convolutional Networks.
.- An Accuracy-Shaping Mechanism for Competitive Distributed Learning.
.- Federated Adversarial Learning for Robust Autonomous Landing Runway Detection.
.- FedInc: One-shot Federated Tuning for Collaborative Incident Recognition.
.- Layer-wised Sparsification Based on Hypernetwork for Distributed NN Training.
.- Security Assessment of Hierarchical Federated Deep Learning.
.- Time Series Processing.
.- ESSformer: Transformers with ESS Attention for Long-Term Series Forecasting.
.- Fusion of image representations for time series classification with deep learning.
.- HierNBeats: Hierarchical Neural Basis Expansion Analysis for Hierarchical Time Series Forecasting.
.- Learning Seasonal-Trend Representations and Conditional Heteroskedasticity for Time Series Analysis.
.- One Process Spatiotemporal Learning of Transformers via Vcls Token for Multivariate Time Series Forecasting.
.- STformer: Spatio-Temporal Transformer for Multivariate Time Series Anomaly Detection.
.- TF-CL: Time Series Forcasting Based on Time-Frequency Domain Contrastive Learning.