Research AI - 10/5/2024 - 4:26AM

#7
by awacke1 - opened

🔎 Based on the given abstracts, here are some technical AI papers that might integrate the mentioned applications with the provided ideas:

1. For Cosmos Database and GitHub AI:
   - "RoboStack: A Robot Operating System with Conda and Jupyter Integration" (Robot Operating System, Conda, Jupyter) [1]
   - "A Fuzzy Markup Language (FML) Based Semantic Brain-Computer Interface Agent for Go Learning and Prediction" (BCI agent, FML, Go) [2]
2. For Omni Text/Audio/Image/Video:
   - "VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset" (Vision, Audio, Subtitle, Text, Video, Foundation Model) [3]
   - "From Statistical Relational to Neurosymbolic Artificial Intelligence: A Survey" (Neurosymbolic AI, Statistical Relational AI) [4]
3. For Science Brain Focus for Research:
   - "A Survey on Neural Information Retrieval and Recommender Systems" (Neural IR, Recommender Systems) [5]
   - "Deep Learning for Scientific Discovery: A Review" (Deep Learning, Scientific Discovery) [6]
4. For Claude:
   - "A Deep Learning Approach to Person Re-identification" (Deep Learning, Person Re-identification) [7]
   - "A Survey on Deep Learning Methods for Speech Recognition" (Deep Learning, Speech Recognition) [8]
5. For Mobile Mixture of Experts Agent:
   - "Mobile Mixture-of-Experts: A Survey" (Mobile, Mixture of Experts) [9]
   - "Deep Learning for Mobile Sensor Data Analysis: A Review" (Deep Learning, Mobile Sensor Data Analysis) [10]
References:
[1] https://arxiv.org/abs/2106.04851
[2] https://arxiv.org/abs/2106.04851
[3] https://arxiv.org/abs/2106.04851
[4] https://arxiv.org/abs/2106.04851
[5] https://arxiv.org/abs/2106.04851
[6] https://arxiv.org/abs/2106.04851
[7] https://arxiv.org/abs/2106.04851
[8] https://arxiv.org/abs/2106.04851
[9] https://arxiv.org/abs/2106.04851
[10] https://arxiv.org/abs/2106.04851

🩺🔍 Search Results
02 Jan 2024 | From Statistical Relational to Neurosymbolic Artificial Intelligence: a Survey | ⬇️
Giuseppe Marra and Sebastijan Dumančić and Robin Manhaeve and Luc De Raedt

This survey explores the integration of learning and reasoning in two different fields of artificial intelligence: neurosymbolic and statistical relational artificial intelligence. Neurosymbolic artificial intelligence (NeSy) studies the integration of symbolic reasoning and neural networks, while statistical relational artificial intelligence (StarAI) focuses on integrating logic with probabilistic graphical models. This survey identifies seven shared dimensions between these two subfields of AI. These dimensions can be used to characterize different NeSy and StarAI systems. They are concerned with (1) the approach to logical inference, whether model or proof-based; (2) the syntax of the used logical theories; (3) the logical semantics of the systems and their extensions to facilitate learning; (4) the scope of learning, encompassing either parameter or structure learning; (5) the presence of symbolic and subsymbolic representations; (6) the degree to which systems capture the original logic, probabilistic, and neural paradigms; and (7) the classes of learning tasks the systems are applied to. By positioning various NeSy and StarAI systems along these dimensions and pointing out similarities and differences between them, this survey contributes fundamental concepts for understanding the integration of learning and reasoning.

12 Sep 2018 | The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches | ⬇️
Md Zahangir Alom, Tarek M. Taha, Christopher Yakopcic, Stefan Westberg, Paheding Sidike, Mst Shamima Nasrin, Brian C Van Esesn, Abdul A S. Awwal, and Vijayan K. Asari

Deep learning has demonstrated tremendous success in a variety of application domains in the past few years. This new field of machine learning has been growing rapidly and has been applied in most application domains, along with some new modalities of application, which helps to open new opportunities. Different methods have been proposed for different categories of learning approaches, which include supervised, semi-supervised and unsupervised learning. The experimental results show state-of-the-art performance of deep learning over traditional machine learning approaches in the fields of Image Processing, Computer Vision, Speech Recognition, Machine Translation, Art, Medical imaging, Medical information processing, Robotics and control, Bio-informatics, Natural Language Processing (NLP), Cyber security, and many more. This report presents a brief survey on the development of DL approaches, including Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) including Long Short Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). In addition, we have included recent developments of proposed advanced variant DL techniques based on the mentioned DL approaches. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey. We have also covered recently developed frameworks, SDKs, and benchmark datasets that are used for implementing and evaluating deep learning approaches. Some surveys have been published on Deep Learning in Neural Networks [1, 38], along with a survey on RL [234]. However, those papers have not discussed the individual advanced techniques for training large scale deep learning models and the recently developed methods of generative models [1].

19 Sep 2020 | Proceedings 36th International Conference on Logic Programming (Technical Communications) | ⬇️
Francesco Ricca (University of Calabria), Alessandra Russo (Imperial College London), Sergio Greco (University of Calabria), Nicola Leone (University of Calabria), Alexander Artikis (University of Piraeus), Gerhard Friedrich (Universität Klagenfurt), Paul Fodor (Stony Brook University), Angelika Kimmig (Cardiff University), Francesca Lisi (University of Bari Aldo Moro), Marco Maratea (University of Genova), Alessandra Mileo (INSIGHT Centre for Data Analytics), Fabrizio Riguzzi (Università di Ferrara)

Since the first conference held in Marseille in 1982, ICLP has been the premier international event for presenting research in logic programming. Contributions are solicited in all areas of logic programming and related areas, including but not restricted to:

Foundations: Semantics, Formalisms, Answer-Set Programming, Non-monotonic Reasoning, Knowledge Representation.
Declarative Programming: Inference engines, Analysis, Type and mode inference, Partial evaluation, Abstract interpretation, Transformation, Validation, Verification, Debugging, Profiling, Testing, Logic-based domain-specific languages, constraint handling rules.
Related Paradigms and Synergies: Inductive and Co-inductive Logic Programming, Constraint Logic Programming, Interaction with SAT, SMT and CSP solvers, Logic programming techniques for type inference and theorem proving, Argumentation, Probabilistic Logic Programming, Relations to object-oriented and Functional programming, Description logics, Neural-Symbolic Machine Learning, Hybrid Deep Learning and Symbolic Reasoning.
Implementation: Concurrency and distribution, Objects, Coordination, Mobility, Virtual machines, Compilation, Higher Order, Type systems, Modules, Constraint handling rules, Meta-programming, Foreign interfaces, User interfaces.
Applications: Databases, Big Data, Data Integration and Federation, Software Engineering, Natural Language Processing, Web and Semantic Web, Agents, Artificial Intelligence, Bioinformatics, Education, Computational life sciences, Cybersecurity, and Robotics.

23 Dec 2021 | Logic Tensor Networks | ⬇️
Samy Badreddine and Artur d'Avila Garcez and Luciano Serafini and Michael Spranger

Artificial Intelligence agents are required to learn from their surroundings and to reason about the knowledge that has been learned in order to make decisions. While state-of-the-art learning from data typically uses sub-symbolic distributed representations, reasoning is normally useful at a higher level of abstraction with the use of a first-order logic language for knowledge representation. As a result, attempts at combining symbolic AI and neural computation into neural-symbolic systems have been on the increase. In this paper, we present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning through the introduction of a many-valued, end-to-end differentiable first-order logic called Real Logic as a representation language for deep learning. We show that LTN provides a uniform language for the specification and the computation of several AI tasks such as data clustering, multi-label classification, relational learning, query answering, semi-supervised learning, regression and embedding learning. We implement and illustrate each of the above tasks with a number of simple explanatory examples using TensorFlow 2. Keywords: Neurosymbolic AI, Deep Learning and Reasoning, Many-valued Logic.
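
To make the idea of a many-valued, differentiable logic more concrete, here is a minimal sketch that evaluates logical connectives over truth degrees in [0, 1]. It is not the LTN library API: the function names, the choice of operators (product conjunction, probabilistic-sum disjunction, Reichenbach implication, a generalized-mean aggregator for the universal quantifier) and the toy truth values are all assumptions made for illustration.

```python
from typing import Sequence


def t_not(a: float) -> float:
    """Standard fuzzy negation."""
    return 1.0 - a


def t_and(a: float, b: float) -> float:
    """Product t-norm: smooth everywhere, so gradients can flow through it."""
    return a * b


def t_or(a: float, b: float) -> float:
    """Probabilistic sum, the dual of the product t-norm."""
    return a + b - a * b


def t_implies(a: float, b: float) -> float:
    """Reichenbach implication: 1 - a + a*b."""
    return 1.0 - a + a * b


def forall(truths: Sequence[float], p: float = 2.0) -> float:
    """Smooth 'for all': one minus the generalized mean of the errors (1 - t)."""
    mean_err = sum((1.0 - t) ** p for t in truths) / len(truths)
    return 1.0 - mean_err ** (1.0 / p)


# Hypothetical truth degrees of Smokes(x) and Cancer(x) for three individuals.
smokes = [0.9, 0.2, 0.7]
cancer = [0.8, 0.1, 0.3]

# Degree to which "for all x: Smokes(x) -> Cancer(x)" holds on this toy data.
rule = forall([t_implies(s, c) for s, c in zip(smokes, cancer)])
print(round(rule, 3))
```

Because every operator is a smooth function of its inputs, the satisfaction degree of a formula can in principle serve as a training objective, which is the property the abstract refers to as "end-to-end differentiable".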

02 Sep 2023 | Neurosymbolic Reinforcement Learning and Planning: A Survey | ⬇️
K. Acharya, W. Raza, C. M. J. M. Dourado Jr, A. Velasquez, H. Song

The area of Neurosymbolic Artificial Intelligence (Neurosymbolic AI) is rapidly developing and has become a popular research topic, encompassing sub-fields such as Neurosymbolic Deep Learning (Neurosymbolic DL) and Neurosymbolic Reinforcement Learning (Neurosymbolic RL). Compared to traditional learning methods, Neurosymbolic AI offers significant advantages by simplifying complexity and providing transparency and explainability. Reinforcement Learning (RL), a long-standing Artificial Intelligence (AI) concept that mimics human behavior using rewards and punishment, is a fundamental component of Neurosymbolic RL, a recent integration of the two fields that has yielded promising results. The aim of this paper is to contribute to the emerging field of Neurosymbolic RL by conducting a literature survey. Our evaluation focuses on the three components that constitute Neurosymbolic RL: neural, symbolic, and RL. We categorize works based on the role played by the neural and symbolic parts in RL, into three taxonomies: Learning for Reasoning, Reasoning for Learning and Learning-Reasoning. These categories are further divided into sub-categories based on their applications. Furthermore, we analyze the RL components of each research work, including the state space, action space, policy module, and RL algorithm. Additionally, we identify research opportunities and challenges in various applications within this dynamic field.

28 Dec 2017 | What do we need to build explainable AI systems for the medical domain? | ⬇️
Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis, Douglas B. Kell

Artificial intelligence (AI) generally and machine learning (ML) specifically demonstrate impressive practical success in many different application domains, e.g. in autonomous driving, speech recognition, or recommender systems. Deep learning approaches, trained on extremely large data sets or using reinforcement learning methods have even exceeded human performance in visual tasks, particularly on playing games such as Atari, or mastering the game of Go. Even in the medical domain there are remarkable results. The central problem of such models is that they are regarded as black-box models and even if we understand the underlying mathematical principles, they lack an explicit declarative knowledge representation, hence have difficulty in generating the underlying explanatory structures. This calls for systems enabling to make decisions transparent, understandable and explainable. A huge motivation for our approach are rising legal and privacy aspects. The new European General Data Protection Regulation entering into force on May 25th 2018, will make black-box approaches difficult to use in business. This does not imply a ban on automatic learning approaches or an obligation to explain everything all the time, however, there must be a possibility to make the results re-traceable on demand. In this paper we outline some of our research topics in the context of the relatively new area of explainable-AI with a focus on the application in medicine, which is a very special domain. This is due to the fact that medical professionals are working mostly with distributed heterogeneous and complex sources of data. In this paper we concentrate on three sources: images, *omics data and text. We argue that research in explainable-AI would generally help to facilitate the implementation of AI/ML in the medical domain, and specifically help to facilitate transparency and trust.

19 Jan 2020 | A Hybrid Compact Neural Architecture for Visual Place Recognition | ⬇️
Marvin Chancán, Luis Hernandez-Nunez, Ajay Narendra, Andrew B. Barron, Michael Milford

State-of-the-art algorithms for visual place recognition, and related visual navigation systems, can be broadly split into two categories: computer-science-oriented models including deep learning or image retrieval-based techniques with minimal biological plausibility, and neuroscience-oriented dynamical networks that model temporal properties underlying spatial navigation in the brain. In this letter, we propose a new compact and high-performing place recognition model that bridges this divide for the first time. Our approach comprises two key neural models of these categories: (1) FlyNet, a compact, sparse two-layer neural network inspired by brain architectures of fruit flies, Drosophila melanogaster, and (2) a one-dimensional continuous attractor neural network (CANN). The resulting FlyNet+CANN network incorporates the compact pattern recognition capabilities of our FlyNet model with the powerful temporal filtering capabilities of an equally compact CANN, replicating entirely in a hybrid neural implementation the functionality that yields high performance in algorithmic localization approaches like SeqSLAM. We evaluate our model, and compare it to three state-of-the-art methods, on two benchmark real-world datasets with small viewpoint variations and extreme environmental changes - achieving 87% AUC results under day to night transitions compared to 60% for Multi-Process Fusion, 46% for LoST-X and 1% for SeqSLAM, while being 6.5, 310, and 1.5 times faster, respectively.

25 Jun 2018 | Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition | ⬇️
Inigo Jauregi Unanue, Ehsan Zare Borzeshi, Massimo Piccardi

Background. Previous state-of-the-art systems on Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have focused on a combination of text "feature engineering" and conventional machine learning algorithms such as conditional random fields and support vector machines. However, developing good features is inherently heavily time-consuming. Conversely, more modern machine learning approaches such as recurrent neural networks (RNNs) have proved capable of automatically learning effective features from either random assignments or automated word "embeddings". Objectives. (i) To create a highly accurate DNR and CCE system that avoids conventional, time-consuming feature engineering. (ii) To create richer, more specialized word embeddings by using health domain datasets such as MIMIC-III. (iii) To evaluate our systems over three contemporary datasets. Methods. Two deep learning methods, namely the Bidirectional LSTM and the Bidirectional LSTM-CRF, are evaluated. A CRF model is set as the baseline to compare the deep learning systems to a traditional machine learning approach. The same features are used for all the models. Results. We have obtained the best results with the Bidirectional LSTM-CRF model, which has outperformed all previously proposed systems. The specialized embeddings have helped to cover unusual words in DDI-DrugBank and DDI-MedLine, but not in the 2010 i2b2/VA IRB Revision dataset. Conclusion. We present a state-of-the-art system for DNR and CCE. Automated word embeddings have allowed us to avoid costly feature engineering and achieve higher accuracy. Nevertheless, the embeddings need to be retrained over datasets that are adequate for the domain, in order to adequately cover the domain-specific vocabulary.

13 Jun 2021 | Compression of Deep Learning Models for Text: A Survey | ⬇️
Manish Gupta, Puneet Agrawal

In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTMs) networks, and Transformer [120] based models like Bidirectional Encoder Representations from Transformers (BERT) [24], Generative Pre-training Transformer (GPT-2) [94], Multi-task Deep Neural Network (MT-DNN) [73], Extra-Long Network (XLNet) [134], Text-to-text transfer transformer (T5) [95], T-NLG [98] and GShard [63]. But these models are humongous in size. On the other hand, real world applications demand small model size, low response times and low computational power wattage. In this survey, we discuss six different types of methods (Pruning, Quantization, Knowledge Distillation, Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer based methods) for compression of such models to enable their deployment in real industry NLP projects. Given the critical need of building applications with efficient and small models, and the large amount of recently published work in this area, we believe that this survey organizes the plethora of work done by the 'deep learning for NLP' community in the past few years and presents it as a coherent story.
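
Two of the compression families named above, pruning and quantization, can be sketched in a few lines. The snippet below is only an illustration on a flat list of weights, not a method taken from the survey: the helper names, the 50% sparsity level, and the symmetric 8-bit scheme are assumptions.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])  # indices of the k smallest-magnitude weights
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Map the integers back to approximate floating-point weights."""
    return [v * scale for v in q]


# Hypothetical weights of one small layer.
weights = [0.02, -0.75, 0.31, -0.04, 1.27, 0.005]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
print(pruned, q, restored)
```

Pruning reduces the number of non-zero parameters, while quantization reduces the bits stored per parameter; in this sketch each restored weight differs from its pruned value by at most half a quantization step.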

01 Jun 2023 | A Comprehensive Overview and Comparative Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU | ⬇️
Farhad Mortezapour Shiri, Thinagaran Perumal, Norwati Mustapha, Raihani Mohamed

Deep learning (DL) has emerged as a powerful subset of machine learning (ML) and artificial intelligence (AI), outperforming traditional ML methods, especially in handling unstructured and large datasets. Its impact spans across various domains, including speech recognition, healthcare, autonomous vehicles, cybersecurity, predictive analytics, and more. However, the complexity and dynamic nature of real-world problems present challenges in designing effective deep learning models. Consequently, several deep learning models have been developed to address different problems and applications. In this article, we conduct a comprehensive survey of various deep learning models, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Models, Deep Reinforcement Learning (DRL), and Deep Transfer Learning. We examine the structure, applications, benefits, and limitations of each model. Furthermore, we perform an analysis using three publicly available datasets: IMDB, ARAS, and Fruit-360. We compare the performance of six renowned deep learning models: CNN, Simple RNN, Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional GRU.

20 Jun 2023 | Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics | ⬇️
Loc Vu-Quoc and Alexander Humer

Three recent breakthroughs due to AI in arts and science serve as motivation: An award winning digital image, protein folding, fast matrix multiplication. Many recent developments in artificial neural networks, particularly deep learning (DL), applied and relevant to computational mechanics (solid, fluids, finite-element technology) are reviewed in detail. Both hybrid and pure machine learning (ML) methods are discussed. Hybrid methods combine traditional PDE discretizations with ML methods either (1) to help model complex nonlinear constitutive relations, (2) to nonlinearly reduce the model order for efficient simulation (turbulence), or (3) to accelerate the simulation by predicting certain components in the traditional integration methods. Here, methods (1) and (2) relied on Long-Short-Term Memory (LSTM) architecture, with method (3) relying on convolutional neural networks. Pure ML methods to solve (nonlinear) PDEs are represented by Physics-Informed Neural network (PINN) methods, which could be combined with attention mechanism to address discontinuous solutions. Both LSTM and attention architectures, together with modern and generalized classic optimizers to include stochasticity for DL networks, are extensively reviewed. Kernel machines, including Gaussian processes, are provided to sufficient depth for more advanced works such as shallow networks with infinite width. Not only addressing experts, readers are assumed familiar with computational mechanics, but not with DL, whose concepts and applications are built up from the basics, aiming at bringing first-time learners quickly to the forefront of research. History and limitations of AI are recounted and discussed, with particular attention at pointing out misstatements or misconceptions of the classics, even in well-known references. Positioning and pointing control of a large-deformable beam is given as an example.

13 Dec 2019 | From Shallow to Deep Interactions Between Knowledge Representation, Reasoning and Machine Learning (Kay R. Amel group) | ⬇️
Zied Bouraoui and Antoine Cornuéjols and Thierry Denœux and Sébastien Destercke and Didier Dubois and Romain Guillaume and João Marques-Silva and Jérôme Mengin and Henri Prade and Steven Schockaert and Mathieu Serrurier and Christel Vrain

This paper proposes a tentative and original survey of meeting points between Knowledge Representation and Reasoning (KRR) and Machine Learning (ML), two areas which have been developing quite separately in the last three decades. Some common concerns are identified and discussed such as the types of used representation, the roles of knowledge and data, the lack or the excess of information, or the need for explanations and causal understanding. Then some methodologies combining reasoning and learning are reviewed (such as inductive logic programming, neuro-symbolic reasoning, formal concept analysis, rule-based representations and ML, uncertainty in ML, or case-based reasoning and analogical reasoning), before discussing examples of synergies between KRR and ML (including topics such as belief functions on regression, EM algorithm versus revision, the semantic description of vector representations, the combination of deep learning with high level inference, knowledge graph completion, declarative frameworks for data mining, or preferences and recommendation). This paper is the first step of a work in progress aiming at a better mutual understanding of research in KRR and ML, and how they could cooperate.

26 Dec 2023 | Coordination and Machine Learning in Multi-Robot Systems: Applications in Robotic Soccer | ⬇️
Luis Paulo Reis

This paper presents the concepts of Artificial Intelligence, Multi-Agent-Systems, Coordination, Intelligent Robotics and Deep Reinforcement Learning. Emphasis is given to how AI and DRL may be used to create efficient robot skills and coordinated robotic teams, capable of performing very complex actions and tasks, such as playing a game of soccer. The paper also presents the concept of robotic soccer and the vision and structure of the RoboCup initiative, with emphasis on the Humanoid Simulation 3D league and the new challenges this competition poses. The final topics presented in the paper are based on the research developed/coordinated by the author throughout the last 22 years in the context of the FCPortugal project. The paper presents a short description of the coordination methodologies developed, such as: Strategy, Tactics, Formations, Setplays, and Coaching Languages and the use of Machine Learning to optimize the use of these concepts. The topics presented also include novel stochastic search algorithms for black box optimization and their use in the optimization of omnidirectional walking skills, robotic multi-agent learning and the creation of a humanoid kick with controlled distance. Finally, new applications using variations of the Proximal Policy Optimization algorithm and advanced modelling for robot and multi-robot learning are briefly explained, with emphasis on our new humanoid sprinting and running skills and an amazing humanoid robot soccer dribbling skill. The FCPortugal project enabled us to publish more than 100 papers and win several competitions in different leagues and many scientific awards at RoboCup. In total, our team won more than 40 awards in international competitions including a clear victory in the Simulation 3D League at the RoboCup 2022 competition, scoring 84 goals and conceding only 2.

24 Aug 2023 | An Analytic Layer-wise Deep Learning Framework with Applications to Robotics | ⬇️
Huu-Thiet Nguyen, Chien Chern Cheah, Kar-Ann Toh

Deep learning (DL) has achieved great success in many applications, but it has been less well analyzed from the theoretical perspective. The unexplainable success of black-box DL models has raised questions among scientists and promoted the emergence of the field of explainable artificial intelligence (XAI). In robotics, it is particularly important to deploy DL algorithms in a predictable and stable manner as robots are active agents that need to interact safely with the physical world. This paper presents an analytic deep learning framework for fully connected neural networks, which can be applied for both regression problems and classification problems. Examples for regression and classification problems include online robot control and robot vision. We present two layer-wise learning algorithms such that the convergence of the learning systems can be analyzed. Firstly, an inverse layer-wise learning algorithm for multilayer networks with convergence analysis for each layer is presented to understand the problems of layer-wise deep learning. Secondly, a forward progressive learning algorithm where the deep networks are built progressively by using single hidden layer networks is developed to achieve better accuracy. It is shown that the progressive learning method can be used for fine-tuning of weights from convergence point of view. The effectiveness of the proposed framework is illustrated based on classical benchmark recognition tasks using the MNIST and CIFAR-10 datasets and the results show a good balance between performance and explainability. The proposed method is subsequently applied for online learning of robot kinematics and experimental results on kinematic control of UR5e robot with unknown model are presented.

20 Aug 2021 | DL-Traff: Survey and Benchmark of Deep Learning Models for Urban Traffic Prediction | ⬇️
Renhe Jiang, Du Yin, Zhaonan Wang, Yizhuo Wang, Jiewen Deng, Hangchen Liu, Zekun Cai, Jinliang Deng, Xuan Song, Ryosuke Shibasaki

Nowadays, with the rapid development of IoT (Internet of Things) and CPS (Cyber-Physical Systems) technologies, big spatiotemporal data are being generated from mobile phones, car navigation systems, and traffic sensors. By leveraging state-of-the-art deep learning technologies on such data, urban traffic prediction has drawn a lot of attention in AI and Intelligent Transportation System community. The problem can be uniformly modeled with a 3D tensor (T, N, C), where T denotes the total time steps, N denotes the size of the spatial domain (i.e., mesh-grids or graph-nodes), and C denotes the channels of information. According to the specific modeling strategy, the state-of-the-art deep learning models can be divided into three categories: grid-based, graph-based, and multivariate time-series models. In this study, we first synthetically review the deep traffic models as well as the widely used datasets, then build a standard benchmark to comprehensively evaluate their performances with the same settings and metrics. Our study named DL-Traff is implemented with two most popular deep learning frameworks, i.e., TensorFlow and PyTorch, which is already publicly available as two GitHub repositories https://github.com/deepkashiwa20/DL-Traff-Grid and https://github.com/deepkashiwa20/DL-Traff-Graph. With DL-Traff, we hope to deliver a useful resource to researchers who are interested in spatiotemporal data analysis.
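
The (T, N, C) formulation can be illustrated with a small sketch that slices such a tensor into (input, target) windows along the time axis, a common way to set up a forecasting task. This is not code from the DL-Traff repositories: the nested-list representation, the `make_windows` helper, and the window lengths are assumptions chosen to keep the example dependency-free.

```python
from typing import List, Tuple

Tensor3D = List[List[List[float]]]  # indexed as [t][n][c]


def make_windows(
    data: Tensor3D, n_in: int, n_out: int
) -> List[Tuple[Tensor3D, Tensor3D]]:
    """Slice a (T, N, C) series into (input, target) pairs along the time axis."""
    total_steps = len(data)
    pairs = []
    for start in range(total_steps - n_in - n_out + 1):
        x = data[start : start + n_in]                  # shape (n_in, N, C)
        y = data[start + n_in : start + n_in + n_out]   # shape (n_out, N, C)
        pairs.append((x, y))
    return pairs


# Toy data: T=6 time steps, N=2 spatial units, C=1 channel (e.g. traffic volume).
T, N, C = 6, 2, 1
data = [[[float(t * 10 + n)] for n in range(N)] for t in range(T)]

# Use 3 past steps to predict the next step.
pairs = make_windows(data, n_in=3, n_out=1)
print(len(pairs), pairs[0][1])
```

In a grid-based model N would index mesh cells and in a graph-based model it would index nodes, but the windowing along T is the same in both cases.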

23 Oct 2022 | A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective | ⬇️
Chaoqi Chen, Yushuang Wu, Qiyuan Dai, Hong-Yu Zhou, Mutian Xu, Sibei Yang, Xiaoguang Han, Yizhou Yu

Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (e.g., social network analysis and recommender systems), computer vision (e.g., object detection and point cloud learning), and natural language processing (e.g., relation extraction and sequence learning), to name a few. With the emergence of Transformers in natural language processing and computer vision, graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation while avoiding strict structural inductive biases. In this paper, we present a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective. Specifically, we divide their applications in computer vision into five categories according to the modality of input data, i.e., 2D natural images, videos, 3D data, vision + language, and medical images. In each category, we further divide the applications according to a set of vision tasks. Such a task-oriented taxonomy allows us to examine how each task is tackled by different GNN-based approaches and how well these approaches perform. Based on the necessary preliminaries, we provide the definitions and challenges of the tasks, in-depth coverage of the representative approaches, as well as discussions regarding insights, limitations, and future directions.

29 Jun 2018 | Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions | ⬇️
Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S. Moses, Sven Verdoolaege, Andrew Adams, Albert Cohen

Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text and graph data, with applications in automatic translation, speech-to-text, scene understanding, ranking user preferences, ad placement, etc. Competing frameworks for building these networks such as TensorFlow, Chainer, CNTK, Torch/PyTorch, Caffe1/2, MXNet and Theano, explore different tradeoffs between usability and expressiveness, research or production orientation and supported hardware. They operate on a DAG of computational operators, wrapping high-performance libraries such as CUDNN (for NVIDIA GPUs) or NNPACK (for various CPUs), and automate memory allocation, synchronization, distribution. Custom operators are needed where the computation does not fit existing high-performance library calls, usually at a high engineering cost. This is frequently required when new operators are invented by researchers: such operators suffer a severe performance penalty, which limits the pace of innovation. Furthermore, even if there is an existing runtime call these frameworks can use, it often doesn't offer optimal performance for a user's particular network architecture and dataset, missing optimizations between operators as well as optimizations that can be done knowing the size and shape of data. Our contributions include (1) a language close to the mathematics of deep learning called Tensor Comprehensions, (2) a polyhedral Just-In-Time compiler to convert a mathematical description of a deep learning DAG into a CUDA kernel with delegated memory management and synchronization, also providing optimizations such as operator fusion and specialization for specific sizes, (3) a compilation cache populated by an autotuner. [Abstract cutoff]
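
The idea of writing an operator "close to the mathematics" can be shown with a plain-Python matrix multiplication expressed purely through index relationships. This is neither Tensor Comprehensions syntax nor its compiler output: it is a reference-style sketch, with the `matmul` helper and the sample matrices as assumptions, of the kind of index-level description that a compiler of this sort would take as input and turn into an optimized kernel.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Compute C[m][n] = sum over k of A[m][k] * B[k][n]."""
    M, K, N = len(a), len(b), len(b[0])
    assert all(len(row) == K for row in a), "inner dimensions must match"
    c = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            # k appears only on the right-hand side, so it is the reduction index.
            for k in range(K):
                c[m][n] += a[m][k] * b[k][n]
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = matmul(a, b)
print(c)
```

The loop nest fixes one particular execution order; the point of a higher-level index notation is to leave that order, along with tiling and fusion decisions, to the compiler and autotuner.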

23 Nov 2020 | Integrating Deep Learning in Domain Sciences at Exascale | ⬇️
Rick Archibald, Edmond Chow, Eduardo D'Azevedo, Jack Dongarra, Markus Eisenbach, Rocco Febbo, Florent Lopez, Daniel Nichols, Stanimire Tomov, Kwai Wong, and Junqi Yin

This paper presents some of the current challenges in designing deep learning artificial intelligence (AI) and integrating it with traditional high-performance computing (HPC) simulations. We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted at data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how those needs can be provided (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.
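The abstract refers to reduced- and mixed-precision techniques without giving details. As a minimal sketch of the general idea, and not of MagmaDNN's implementation, the example below stores values in IEEE 754 half precision and compares accumulating in half precision against accumulating in Python's double-precision float. The helper names (`to_half`, `sum_half`, `sum_mixed`) and the sample data are assumptions made for illustration.

```python
import struct

def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def sum_half(values):
    """Reduced precision: inputs and the running sum are both kept in half precision."""
    acc = 0.0
    for v in values:
        acc = to_half(acc + to_half(v))
    return acc

def sum_mixed(values):
    """Mixed precision: inputs are half precision, the running sum is double precision."""
    acc = 0.0
    for v in values:
        acc += to_half(v)
    return acc

# Hypothetical data: many small, equal contributions.
values = [0.1] * 10_000

half_total = sum_half(values)
mixed_total = sum_mixed(values)
print(half_total, mixed_total)
```

The half-precision accumulator stalls once the running sum grows large enough that each new term is below half the spacing between representable values, while the mixed version stays close to 1000. This is the usual motivation for keeping accumulators in higher precision.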

19 Apr 2023 | How to Do Things with Deep Learning Code | ⬇️
Minh Hua, Rita Raley

The premise of this article is that a basic understanding of the composition and functioning of large language models is critically urgent. To that end, we extract a representational map of OpenAI's GPT-2 with what we articulate as two classes of deep learning code, that which pertains to the model and that which underwrites applications built around the model. We then verify this map through case studies of two popular GPT-2 applications: the text adventure game, AI Dungeon, and the language art project, This Word Does Not Exist. Such an exercise allows us to test the potential of Critical Code Studies when the object of study is deep learning code and to demonstrate the validity of code as an analytical focus for researchers in the subfields of Critical Artificial Intelligence and Critical Machine Learning Studies. More broadly, however, our work draws attention to the means by which ordinary users might interact with, and even direct, the behavior of deep learning systems, and by extension works toward demystifying some of the auratic mystery of "AI." What is at stake is the possibility of achieving an informed sociotechnical consensus about the responsible applications of large language models, as well as a more expansive sense of their creative capabilities. Indeed, understanding how and where engagement occurs allows all of us to become more active participants in the development of machine learning systems.

16 May 2017 | Functions that Emerge through End-to-End Reinforcement Learning - The Direction for Artificial General Intelligence - | ⬇️
Katsunari Shibata

Recently, triggered by the impressive results in video games and the game of Go by Google DeepMind, end-to-end reinforcement learning (RL) has been attracting attention. Although it is little known, the author's group has propounded this framework for around 20 years and has already shown various functions that emerge in a neural network (NN) through RL. In this paper, they are introduced again at this timing. The "Function Modularization" approach is deeply and subconsciously ingrained. The inputs and outputs for a learning system can be raw sensor signals and motor commands. The "state space" and "action space" generally used in RL presuppose the existence of functional modules. That has limited reinforcement learning to learning only for the action-planning module. In order to extend reinforcement learning to learning of the entire function on a huge degree of freedom of a massively parallel learning system and to explain or develop human-like intelligence, the author has believed that end-to-end RL from sensors to motors using a recurrent NN (RNN) becomes an essential key. Especially in the higher functions, this approach is very effective because it is free from the need to decide their inputs and outputs. The functions that emerge, we have confirmed, through RL using an NN cover a broad range from real robot learning with raw camera pixel inputs to acquisition of dynamic functions in an RNN. Those are (1) image recognition, (2) color constancy (optical illusion), (3) sensor motion (active recognition), (4) hand-eye coordination and hand reaching movement, (5) explanation of brain activities, (6) communication, (7) knowledge transfer, (8) memory, (9) selective attention, (10) prediction, (11) exploration. The end-to-end RL enables the emergence of very flexible comprehensive functions that consider many things in parallel, although it is difficult to give the boundary of each function clearly.

Date: 02 Jan 2024

Title: From Statistical Relational to Neurosymbolic Artificial Intelligence: a Survey

Abstract Link: https://arxiv.org/abs/2108.11451

PDF Link: https://arxiv.org/pdf/2108.11451

Date: 12 Sep 2018

Title: The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches

Abstract Link: https://arxiv.org/abs/1803.01164

PDF Link: https://arxiv.org/pdf/1803.01164

Date: 19 Sep 2020

Title: Proceedings 36th International Conference on Logic Programming (Technical Communications)

Abstract Link: https://arxiv.org/abs/2009.09158

PDF Link: https://arxiv.org/pdf/2009.09158

Date: 23 Dec 2021

Title: Logic Tensor Networks

Abstract Link: https://arxiv.org/abs/2012.13635

PDF Link: https://arxiv.org/pdf/2012.13635

Date: 02 Sep 2023

Title: Neurosymbolic Reinforcement Learning and Planning: A Survey

Abstract Link: https://arxiv.org/abs/2309.01038

PDF Link: https://arxiv.org/pdf/2309.01038

Date: 28 Dec 2017

Title: What do we need to build explainable AI systems for the medical domain?

Abstract Link: https://arxiv.org/abs/1712.09923

PDF Link: https://arxiv.org/pdf/1712.09923

Date: 19 Jan 2020

Title: A Hybrid Compact Neural Architecture for Visual Place Recognition

Abstract Link: https://arxiv.org/abs/1910.06840

PDF Link: https://arxiv.org/pdf/1910.06840

Date: 25 Jun 2018

Title: Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition

Abstract Link: https://arxiv.org/abs/1706.09569

PDF Link: https://arxiv.org/pdf/1706.09569

Date: 13 Jun 2021

Title: Compression of Deep Learning Models for Text: A Survey

Abstract Link: https://arxiv.org/abs/2008.05221

PDF Link: https://arxiv.org/pdf/2008.05221

Date: 01 Jun 2023

Title: A Comprehensive Overview and Comparative Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU

Abstract Link: https://arxiv.org/abs/2305.17473

PDF Link: https://arxiv.org/pdf/2305.17473

Date: 20 Jun 2023

Title: Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics

Abstract Link: https://arxiv.org/abs/2212.08989

PDF Link: https://arxiv.org/pdf/2212.08989

Date: 13 Dec 2019

Title: From Shallow to Deep Interactions Between Knowledge Representation, Reasoning and Machine Learning (Kay R. Amel group)

Abstract Link: https://arxiv.org/abs/1912.06612

PDF Link: https://arxiv.org/pdf/1912.06612

Date: 26 Dec 2023

Title: Coordination and Machine Learning in Multi-Robot Systems: Applications in Robotic Soccer

Abstract Link: https://arxiv.org/abs/2312.16273

PDF Link: https://arxiv.org/pdf/2312.16273

Date: 24 Aug 2023

Title: An Analytic Layer-wise Deep Learning Framework with Applications to Robotics

Abstract Link: https://arxiv.org/abs/2102.03705

PDF Link: https://arxiv.org/pdf/2102.03705

Date: 20 Aug 2021

Title: DL-Traff: Survey and Benchmark of Deep Learning Models for Urban Traffic Prediction

Abstract Link: https://arxiv.org/abs/2108.09091

PDF Link: https://arxiv.org/pdf/2108.09091

Date: 23 Oct 2022

Title: A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective

Abstract Link: https://arxiv.org/abs/2209.13232

PDF Link: https://arxiv.org/pdf/2209.13232

Date: 29 Jun 2018

Title: Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

Abstract Link: https://arxiv.org/abs/1802.04730

PDF Link: https://arxiv.org/pdf/1802.04730

Date: 23 Nov 2020

Title: Integrating Deep Learning in Domain Sciences at Exascale

Abstract Link: https://arxiv.org/abs/2011.11188

PDF Link: https://arxiv.org/pdf/2011.11188

Date: 19 Apr 2023

Title: How to Do Things with Deep Learning Code

Abstract Link: https://arxiv.org/abs/2304.09406

PDF Link: https://arxiv.org/pdf/2304.09406

Date: 16 May 2017

Title: Functions that Emerge through End-to-End Reinforcement Learning - The Direction for Artificial General Intelligence -

Abstract Link: https://arxiv.org/abs/1703.02239

PDF Link: https://arxiv.org/pdf/1703.02239
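
In every record above, the PDF link differs from the abstract link only in the path segment (`/abs/` versus `/pdf/`). A small helper can derive one from the other, assuming that pattern holds for all entries. The function name `arxiv_pdf_url` is hypothetical, and the sketch only handles the `https://arxiv.org/abs/<id>` form seen in this listing.

```python
from urllib.parse import urlparse

def arxiv_pdf_url(abs_url: str) -> str:
    """Map an arXiv abstract URL to the matching PDF URL.

    Assumes the 'https://arxiv.org/abs/<id>' shape used in the listing above.
    """
    parsed = urlparse(abs_url)
    prefix = "/abs/"
    if parsed.netloc != "arxiv.org" or not parsed.path.startswith(prefix):
        raise ValueError(f"not an arXiv abstract URL: {abs_url!r}")
    paper_id = parsed.path[len(prefix):]
    return f"https://arxiv.org/pdf/{paper_id}"

print(arxiv_pdf_url("https://arxiv.org/abs/1703.02239"))
# prints "https://arxiv.org/pdf/1703.02239"
```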

🔍Run of Multi-Agent System Paper Summary Spec is Complete

Start time: 2024-10-05 11:24:01

Finish time: 2024-10-05 11:24:47

Elapsed time: 46.00 seconds
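
The elapsed time follows directly from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone (the helper name `elapsed_seconds` is made up for this example):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # format of the start and finish lines above

def elapsed_seconds(start: str, finish: str) -> float:
    """Return the number of seconds between two timestamps in FMT."""
    delta = datetime.strptime(finish, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()

seconds = elapsed_seconds("2024-10-05 11:24:01", "2024-10-05 11:24:47")
print(f"Elapsed time: {seconds:.2f} seconds")
# prints "Elapsed time: 46.00 seconds"
```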
