Artificial Intelligence Preprint | 2019-07-04

in #artificial5 years ago

Artificial Intelligence


Benchmarking Model-Based Reinforcement Learning (1907.02057v1)

Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba

2019-07-03

Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL. However, research in model-based RL has not been very standardized. It is fairly common for authors to experiment with self-designed environments, and there are several separate lines of research, which are sometimes closed-sourced or not reproducible. Accordingly, it is an open question how these various existing MBRL algorithms perform relative to each other. To facilitate research in MBRL, in this paper we gather a wide collection of MBRL algorithms and propose over 18 benchmarking environments specially designed for MBRL. We benchmark these algorithms with unified problem settings, including noisy environments. Beyond cataloguing performance, we explore and unify the underlying algorithmic differences across MBRL algorithms. We characterize three key research challenges for future MBRL research: the dynamics bottleneck, the planning horizon dilemma, and the early-termination dilemma. Finally, to maximally facilitate future research on MBRL, we open-source our benchmark in http://www.cs.toronto.edu/~tingwuwang/mbrl.html.

Deep neural network-based classification model for Sentiment Analysis (1907.02046v1)

Donghang Pan, Jingling Yuan, Lin Li, Deming Sheng

2019-07-03

The growing prosperity of social networks has brought great challenges to the sentimental tendency mining of users. As more and more researchers pay attention to the sentimental tendency of online users, rich research results have been obtained based on the sentiment classification of explicit texts. However, research on the implicit sentiment of users is still in its infancy. Aiming at the difficulty of implicit sentiment classification, a research on implicit sentiment classification model based on deep neural network is carried out. Classification models based on DNN, LSTM, Bi-LSTM and CNN were established to judge the tendency of the user's implicit sentiment text. Based on the Bi-LSTM model, the classification model of word-level attention mechanism is studied. The experimental results on the public dataset show that the established LSTM series classification model and CNN classification model can achieve good sentiment classification effect, and the classification effect is significantly better than the DNN model. The Bi-LSTM based attention mechanism classification model obtained the optimal R value in the positive category identification.

Combining Q&A Pair Quality and Question Relevance Features on Community-based Question Retrieval (1907.02031v1)

Dong Li, Lin Li

2019-07-03

The Q&A community has become an important way for people to access knowledge and information from the Internet. However, the existing translation based on models does not consider the query specific semantics when assigning weights to query terms in question retrieval. So we improve the term weighting model based on the traditional topic translation model and further considering the quality characteristics of question and answer pairs, this paper proposes a communitybased question retrieval method that combines question and answer on quality and question relevance (T2LM+). We have also proposed a question retrieval method based on convolutional neural networks. The results show that Compared with the relatively advanced methods, the two methods proposed in this paper increase MAP by 4.91% and 6.31%.

Life is Random, Time is Not: Markov Decision Processes with Window Objectives (1901.03571v2)

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour

2019-01-11

The window mechanism was introduced by Chatterjee et al. [1] to strengthen classical game objectives with time bounds. It permits to synthesize system controllers that exhibit acceptable behaviors within a configurable time frame, all along their infinite execution, in contrast to the traditional objectives that only require correctness of behaviors in the limit. The window concept has proved its interest in a variety of two-player zero-sum games, thanks to the ability to reason about such time bounds in system specifications, but also the increased tractability that it usually yields. In this work, we extend the window framework to stochastic environments by considering the fundamental threshold probability problem in Markov decision processes for window objectives. That is, given such an objective, we want to synthesize strategies that guarantee satisfying runs with a given probability. We solve this problem for the usual variants of window objectives, where either the time frame is set as a parameter, or we ask if such a time frame exists. We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games. Our work paves the way to a wide use of the window mechanism in stochastic models. [1] Krishnendu Chatterjee, Laurent Doyen, Mickael Randour, and Jean-Fran\c{c}ois Raskin. Looking at mean-payoff and total-payoff through windows. Inf. Comput., 242:25-52, 2015.

Chasing Ghosts: Instruction Following as Bayesian State Tracking (1907.02022v1)

Peter Anderson, Ayush Shrivastava, Devi Parikh, Dhruv Batra, Stefan Lee

2019-07-03

A visually-grounded navigation instruction can be interpreted as a sequence of expected observations and actions an agent following the correct trajectory would encounter and perform. Based on this intuition, we formulate the problem of finding the goal location in Vision-And-Language Navigation (VLN) within the framework of Bayesian state tracking - learning observation and motion models conditioned on these expectable events. Together with a mapper that constructs a semantic spatial map on-the-fly during navigation, we formulate an end-to-end differentiable Bayes filter and train it to identify the goal by predicting the most likely trajectory through the map according to the instructions. The resulting navigation policy constitutes a new approach to instruction following that explicitly models a probability distribution over states, encoding strong geometric and algorithmic priors while enabling greater explainability. Our experiments show that our approach outperforms strong baselines when predicting the goal location in VLN.

Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias (1901.00433v2)

Patrick Forré, Joris M. Mooij

2019-01-02

We prove the main rules of causal calculus (also called do-calculus) for i/o structural causal models (ioSCMs), a generalization of a recently proposed general class of non-/linear structural causal models that allow for cycles, latent confounders and arbitrary probability distributions. We also generalize adjustment criteria and formulas from the acyclic setting to the general one (i.e. ioSCMs). Such criteria then allow to estimate (conditional) causal effects from observational data that was (partially) gathered under selection bias and cycles. This generalizes the backdoor criterion, the selection-backdoor criterion and extensions of these to arbitrary ioSCMs. Together, our results thus enable causal reasoning in the presence of cycles, latent confounders and selection bias. Finally, we extend the ID algorithm for the identification of causal effects to ioSCMs.

Anatomically Consistent Segmentation of Organs at Risk in MRI with Convolutional Neural Networks (1907.02003v1)

Pawel Mlynarski, Hervé Delingette, Hamza Alghamdi, Pierre-Yves Bondiau, Nicholas Ayache

2019-07-03

Planning of radiotherapy involves accurate segmentation of a large number of organs at risk, i.e. organs for which irradiation doses should be minimized to avoid important side effects of the therapy. We propose a deep learning method for segmentation of organs at risk inside the brain region, from Magnetic Resonance (MR) images. Our system performs segmentation of eight structures: eye, lens, optic nerve, optic chiasm, pituitary gland, hippocampus, brainstem and brain. We propose an efficient algorithm to train neural networks for an end-to-end segmentation of multiple and non-exclusive classes, addressing problems related to computational costs and missing ground truth segmentations for a subset of classes. We enforce anatomical consistency of the result in a postprocessing step, in particular we introduce a graph-based algorithm for segmentation of the optic nerves, enforcing the connectivity between the eyes and the optic chiasm. We report cross-validated quantitative results on a database of 44 contrast-enhanced T1-weighted MRIs with provided segmentations of the considered organs at risk, which were originally used for radiotherapy planning. In addition, the segmentations produced by our model on an independent test set of 50 MRIs are evaluated by an experienced radiotherapist in order to qualitatively assess their accuracy. The mean distances between produced segmentations and the ground truth ranged from 0.1 mm to 0.7 mm across different organs. A vast majority (96 %) of the produced segmentations were found acceptable for radiotherapy planning.

HINT: Hierarchical Invertible Neural Transport for General and Sequential Bayesian inference (1905.10687v2)

Gianluca Detommaso, Jakob Kruse, Lynton Ardizzone, Carsten Rother, Ullrich Köthe, Robert Scheichl

2019-05-25

In this paper, we introduce Hierarchical Invertible Neural Transport (HINT), an algorithm that merges Invertible Neural Networks and optimal transport to sample from a posterior distribution in a Bayesian framework. This method exploits a hierarchical architecture to construct a Knothe-Rosenblatt transport map between an arbitrary density and the joint density of hidden variables and observations. After training the map, samples from the posterior can be immediately recovered for any contingent observation. Any underlying model evaluation can be performed fully offline from training without the need of a model-gradient. Furthermore, no analytical evaluation of the prior is necessary, which makes HINT an ideal candidate for sequential Bayesian inference. We demonstrate the efficacy of HINT on two numerical experiments.

Cooperative Schedule-Driven Intersection Control with Connected and Autonomous Vehicles (1907.01984v1)

Hsu-Chieh Hu, Stephen F. Smith, Rick Goldstein

2019-07-03

Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and in real-time constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In this paper, we propose a cooperative algorithm that utilizes both connected and autonomous vehicles (CAV) and schedule-driven traffic control to create better traffic flow in the city. The algorithm enables an intersection scheduling agent to adjust the arrival time of an approaching platoon through use of wireless communication to control the velocity of vehicles. The sequence of approaching platoons is thus shifted toward a new shape that has smaller cumulative delay. We demonstrate how this algorithm outperforms the original approach in a real-time traffic signal control problem.

Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control (1907.01978v1)

Hsu-Chieh Hu, Stephen F. Smith

2019-07-03

Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and in real-time constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In order to achieve network level coordination in a scalable manner, scheduling agents communicate only with their direct neighbors. Each time an agent generates a new intersection schedule it communicates its expected outflows to its downstream neighbors as a prediction of future demand and these outflows are appended to the downstream agent's locally perceived demand. In this paper, we extend this basic coordination algorithm to additionally incorporate the complementary flow of information reflective of an intersection's current congestion level to its upstream neighbors. We present an asynchronous decentralized algorithm for updating intersection schedules and congestion level estimates based on these bi-directional information flows. By relating this algorithm to the self-optimized decision making of the basic operation, we are able to approach network-wide optimality and reduce inefficiency due to strictly self-interested intersection control decisions.