Artificial Intelligence Preprint | 2019-07-10


Pixel-Attentive Policy Gradient for Multi-Fingered Grasping in Cluttered Scenes (1903.03227v3)

Bohan Wu, Iretiayo Akinola, Peter K. Allen

2019-03-08

Recent advances in on-policy reinforcement learning (RL) methods have enabled learning agents in virtual environments to master complex tasks with high-dimensional and continuous observation and action spaces. However, leveraging this family of algorithms in multi-fingered robotic grasping remains a challenge due to large sim-to-real fidelity gaps and the high sample complexity of on-policy RL algorithms. This work aims to bridge these gaps by first reinforcement-learning a multi-fingered robotic grasping policy in simulation that operates in the pixel space of the input: a single depth image. Using a mapping from pixel space to Cartesian space according to the depth map, this method transfers to the real world with high fidelity and introduces a novel attention mechanism that substantially improves grasp success rate in cluttered environments. Finally, the direct-generative nature of this method allows learning of multi-fingered grasps that have flexible end-effector positions, orientations and rotations, as well as all degrees of freedom of the hand.
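
The pixel-to-Cartesian mapping mentioned in the abstract can be illustrated with standard pinhole-camera deprojection. This is a minimal sketch, not the authors' implementation: the function name `deproject_pixel`, the intrinsics (`fx`, `fy`, `cx`, `cy`) and the toy depth image are all assumptions made for illustration.

```python
def deproject_pixel(depth, u, v, fx, fy, cx, cy):
    """Map pixel (u, v) of a depth image to a 3D point in the camera frame.

    Assumes an ideal pinhole camera without lens distortion, where
    depth[v][u] is the distance along the optical axis (e.g. in meters).
    """
    z = float(depth[v][u])
    x = (u - cx) * z / fx  # horizontal offset from the principal point
    y = (v - cy) * z / fy  # vertical offset from the principal point
    return (x, y, z)

# Hypothetical 4x4 depth image (all pixels 0.5 m away) and made-up intrinsics.
depth = [[0.5] * 4 for _ in range(4)]
point = deproject_pixel(depth, u=3, v=1, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
```

A policy that outputs a pixel location could use such a mapping to turn the chosen pixel into a Cartesian target for the end-effector, which is presumably what makes a pixel-space policy transferable across cameras with known intrinsics.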

A Conformance Checking-based Approach for Drift Detection in Business Processes (1907.04276v1)

Víctor Gallego-Fontenla, Juan C. Vidal, Manuel Lama

2019-07-09

Real-life business processes change over time, in both planned and unexpected ways. Detecting these changes is crucial for organizations to ensure that the expected and the real behavior are as similar as possible. These changes over time are called concept drift, and detecting them is a major challenge in process mining because the inherent complexity of the data makes it difficult to distinguish between a change and an anomalous execution. In this paper, we present C2D2 (Conformance Checking-based Drift Detection), a new approach to detect sudden control-flow changes in the process models from event traces. C2D2 combines discovery techniques with conformance checking methods to perform an offline detection. Our approach has been validated with a synthetic benchmarking dataset formed by 68 logs, showing an improvement in the accuracy while maintaining a minimum delay in the drift detection.
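
The general idea of pairing discovery with conformance checking can be sketched on a toy event log. The code below is not C2D2: the "model" is just a set of directly-follows pairs, the fitness measure is a simplification, and all names (`directly_follows`, `fitness`, `detect_drift`) and parameters are hypothetical.

```python
def directly_follows(traces):
    """'Discover' a toy model: the set of directly-follows activity pairs."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}

def fitness(trace, model):
    """Fraction of the trace's directly-follows pairs allowed by the model."""
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 1.0
    return sum(pair in model for pair in pairs) / len(pairs)

def detect_drift(log, window, threshold):
    """Return the index of the first trace that starts a full window of
    low-fitness traces, or None if no such window exists."""
    model = directly_follows(log[:window])  # model of the reference behavior
    for start in range(window, len(log) - window + 1):
        recent = log[start:start + window]
        if all(fitness(t, model) < threshold for t in recent):
            return start
    return None

# Hypothetical log: the process changes from A-B-C to A-C-B at trace index 4.
log = [list("ABC")] * 4 + [list("ACB")] * 4
drift_at = detect_drift(log, window=3, threshold=1.0)
```

Requiring a whole window of non-conforming traces, rather than a single one, is one simple way to separate a sustained change from an isolated anomalous execution.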

A Scheme for Dynamic Risk-Sensitive Sequential Decision Making (1907.04269v1)

Shuai Ma, Jia Yuan Yu, Ahmet Satir

2019-07-09

We present a scheme for sequential decision making with a risk-sensitive objective and constraints in a dynamic environment. A neural network is trained as an approximator of the mapping from parameter space to space of risk and policy with risk-sensitive constraints. For a given risk-sensitive problem, in which the objective and constraints are, or can be estimated by, functions of the mean and variance of return, we generate a synthetic dataset as training data. Parameters defining a targeted process might be dynamic, i.e., they might vary over time, so we sample them within specified intervals to deal with these dynamics. We show that: (i) most risk measures can be estimated using return variance; (ii) by virtue of the state-augmentation transformation, practical problems modeled by Markov decision processes with stochastic rewards can be solved in a risk-sensitive scenario; and (iii) the proposed scheme is validated by a numerical experiment.
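
To make concrete how a risk measure can be expressed through the mean and variance of return, here is a small sketch with two textbook quantities: a mean-variance objective and a distribution-free return floor derived from Cantelli's inequality. These are illustrative choices, not necessarily the measures used in the paper; the function names and the sample returns are made up, and plugging in sample estimates for the true mean and variance is itself an approximation.

```python
import math
import statistics

def mean_variance_objective(returns, risk_aversion):
    """Mean-variance score: mean(R) - risk_aversion * Var(R)."""
    mu = statistics.fmean(returns)
    var = statistics.pvariance(returns)
    return mu - risk_aversion * var

def cantelli_return_floor(returns, alpha):
    """Level that the return exceeds with probability at least 1 - alpha.

    Follows from Cantelli's inequality, P(R <= mu - k*sigma) <= 1 / (1 + k^2),
    with k chosen so that the right-hand side equals alpha.
    """
    mu = statistics.fmean(returns)
    sigma = math.sqrt(statistics.pvariance(returns))
    k = math.sqrt((1.0 - alpha) / alpha)
    return mu - k * sigma

sample_returns = [1.0, 3.0, 1.0, 3.0]  # made-up episode returns
score = mean_variance_objective(sample_returns, risk_aversion=0.5)
floor = cantelli_return_floor(sample_returns, alpha=0.2)
```

With these made-up returns the mean is 2 and the variance is 1, so both quantities depend on the return distribution only through its first two moments.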

Depth with Nonlinearity Creates No Bad Local Minima in ResNets (1810.09038v3)

Kenji Kawaguchi, Yoshua Bengio

2018-10-21

In this paper, we prove that depth with nonlinearity creates no bad local minima in a type of arbitrarily deep ResNets with arbitrary nonlinear activation functions, in the sense that the values of all local minima are no worse than the global minimum value of corresponding classical machine-learning models, and are guaranteed to further improve via residual representations. As a result, this paper provides an affirmative answer to an open question stated in a paper in the conference on Neural Information Processing Systems 2018. This paper advances the optimization theory of deep learning only for ResNets and not for other network architectures.

On the Semantic Interpretability of Artificial Intelligence Models (1907.04105v1)

Vivian S. Silva, André Freitas, Siegfried Handschuh

2019-07-09

Artificial Intelligence models are becoming increasingly powerful and accurate, supporting or even replacing human decision making. But with increased power and accuracy also comes higher complexity, making it hard for users to understand how the model works and what the reasons behind its predictions are. Humans must explain and justify their decisions, and so must the AI models supporting them in this process, making semantic interpretability an emerging field of study. In this work, we look at interpretability from a broader point of view, going beyond the machine learning scope and covering different AI fields such as distributional semantics and fuzzy logic, among others. We examine and classify the models according to their nature and also based on how they introduce interpretability features, analyzing how each approach affects the final users and pointing to gaps that still need to be addressed to provide more human-centered interpretability solutions.

Procedural Content Generation through Quality Diversity (1907.04053v1)

Daniele Gravina, Ahmed Khalifa, Antonios Liapis, Julian Togelius, Georgios N. Yannakakis

2019-07-09

Quality-diversity (QD) algorithms search for a set of good solutions which cover a space as defined by behavior metrics. This simultaneous focus on quality and diversity with explicit metrics sets QD algorithms apart from standard single- and multi-objective evolutionary algorithms, as well as from diversity preservation approaches such as niching. These properties open up new avenues for artificial intelligence in games, in particular for procedural content generation. Creating multiple systematically varying solutions allows new approaches to creative human-AI interaction as well as adaptivity. In the last few years, a handful of applications of QD to procedural content generation and game playing have been proposed; we discuss these and propose challenges for future work.
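
As an illustration of how a QD search keeps a set of good solutions spread over a behavior space, the following is a minimal sketch in the style of MAP-Elites, one well-known QD algorithm. The abstract does not commit to a specific algorithm, and the toy task, mutation scheme and all parameter values here are assumptions.

```python
import random

def map_elites(evaluate, n_bins, iterations, seed=0):
    """Keep the best solution found in each behavior bin (the 'archive')."""
    rng = random.Random(seed)
    archive = {}  # bin index -> (quality, solution)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite, clamping genes to [0, 1].
            _, parent = rng.choice(list(archive.values()))
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in parent]
        else:
            # Occasionally inject a fresh random solution.
            child = [rng.random(), rng.random()]
        quality, behavior = evaluate(child)
        cell = min(n_bins - 1, int(behavior * n_bins))
        # Replace the elite of this cell only if the child is better.
        if cell not in archive or quality > archive[cell][0]:
            archive[cell] = (quality, child)
    return archive

def toy_evaluate(solution):
    """Hypothetical task: quality peaks when the second gene equals 0.5;
    the behavior descriptor is the first gene."""
    x, y = solution
    return -(y - 0.5) ** 2, x

archive = map_elites(toy_evaluate, n_bins=5, iterations=2000)
```

The result is one elite per behavior bin rather than a single optimum, which is the property that makes such algorithms attractive for generating many systematically varying pieces of game content.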

Estimating Mass Distribution of Articulated Objects through Physical Interaction (1907.03964v1)

Niranjan Kumar Kannabiran, Irfan Essa, C. Karen Liu

2019-07-09

We explore the problem of estimating the mass distribution of an articulated object by an interactive agent. Our method predicts the mass distribution accurately using only information that can be reliably acquired by the limited sensing and actuating capabilities of a robotic agent that interacts with it. Inspired by the role of exploratory play in human infants, we take the combined approach of supervised and reinforcement learning to train the agent such that it learns to strategically interact with the object for estimating its mass distribution. Our method consists of two neural networks: the policy network, which decides how to interact with the object, and the predictor network, which estimates the mass distribution given a history of observations and interactions. Using our method, we train a robotic arm to estimate the mass distribution of an object with moving parts (e.g. an articulated rigid body system) by pushing it on a surface with unknown friction properties. We also test the robustness of our learned model by transferring it to another robot arm with different end-effector geometry. The empirical results show that our method significantly outperforms the baseline agent, which uses random pushes to collect data for estimation.

Learning by Abstraction: The Neural State Machine (1907.03950v1)

Drew A. Hudson, Christopher D. Manning

2019-07-09

We introduce the Neural State Machine, seeking to bridge the gap between the neural and symbolic views of AI and integrate their complementary strengths for the task of visual reasoning. Given an image, we first predict a probabilistic graph that represents its underlying semantics and serves as a structured world model. Then, we perform sequential reasoning over the graph, iteratively traversing its nodes to answer a given question or draw a new inference. In contrast to most neural architectures that are designed to closely interact with the raw sensory data, our model operates instead in an abstract latent space, by transforming both the visual and linguistic modalities into semantic concept-based representations, thereby achieving enhanced transparency and modularity. We evaluate our model on VQA-CP and GQA, two recent VQA datasets that involve compositionality, multi-step inference and diverse reasoning skills, achieving state-of-the-art results in both cases. We provide further experiments that illustrate the model's strong generalization capacity across multiple dimensions, including novel compositions of concepts, changes in the answer distribution, and unseen linguistic structures, demonstrating the qualities and efficacy of our approach.
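
The sequential reasoning step described above, shifting attention from node to node over a probabilistic graph, can be pictured with a heavily simplified sketch. It is not the paper's model: the graph, the per-step edge weights and the function name `soft_traverse` are hand-made assumptions, whereas a learned model would predict such quantities from the image and the question.

```python
def soft_traverse(state, edges, steps):
    """Propagate a probability distribution over graph nodes.

    state: dict node -> probability of currently attending to that node.
    edges: list (one entry per reasoning step) of dicts (src, dst) -> weight.
    """
    for step in range(steps):
        new_state = {node: 0.0 for node in state}
        for (src, dst), weight in edges[step].items():
            new_state[dst] += state[src] * weight
        total = sum(new_state.values())
        if total == 0.0:
            break  # no applicable edge: keep the previous state
        state = {node: p / total for node, p in new_state.items()}
    return state

# Hypothetical scene graph for "What is on the table next to the cup?"
start = {"cup": 1.0, "table": 0.0, "book": 0.0}
step_edges = [
    {("cup", "table"): 0.9, ("cup", "book"): 0.1},    # step 1: "next to"
    {("table", "book"): 1.0, ("book", "book"): 1.0},  # step 2: "on"
]
final = soft_traverse(start, step_edges, steps=2)
answer = max(final, key=final.get)
```

Because the state is a distribution rather than a single node, uncertainty in the predicted graph carries through each reasoning step instead of being resolved prematurely.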

Understanding Player Engagement and In-Game Purchasing Behavior with Ensemble Learning (1907.03947v1)

Anna Guitart, Ana Fernández del Río, África Periáñez

2019-07-09

As video games attract more and more players, the major challenge for game studios is to retain them. We present a deep behavioral analysis of churn (game abandonment) and what we call "purchase churn" (the transition from paying to non-paying user). A series of churning behavior profiles are identified, which allows a classification of churners in terms of whether they eventually return to the game (false churners) or start purchasing again (false purchase churners), and in terms of their subsequent behavior. The impact of excluding some or all of these churners from the training sample is then explored in several churn and purchase churn prediction models. Our results suggest that discarding certain combinations of "zombies" (players whose activity is extremely sporadic) and false churners has a significant positive impact in all models considered.

Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks in Smart Cities (1906.05015v5)

Ming Zhu, Xiao-Yang Liu, Xiaodong Wang

2019-06-12

Unmanned aerial vehicles (UAVs) are envisioned to complement the 5G communication infrastructure in future smart cities. Hot spots easily appear at road intersections, where effective communication among vehicles is challenging. UAVs may serve as relays with the advantages of low price, easy deployment, line-of-sight links, and flexible mobility. In this paper, we study a UAV-assisted vehicular network where the UAV jointly adjusts its transmission power and bandwidth allocation under 3D flight to maximize the total throughput. First, we formulate a Markov decision process (MDP) problem by modeling the mobility of the UAV/vehicles and the state transitions. Second, we solve the target problem using a deep reinforcement learning method, namely the deep deterministic policy gradient, and propose three solutions with different control objectives; we then extend the proposed solutions to account for the energy consumption of 3D flight. Third, in a simplified model with small state and action spaces, we verify the optimality of the proposed algorithms. Finally, we demonstrate the effectiveness of the proposed algorithms in a realistic model by comparing them with two baseline schemes.