Feature Construction for Inverse Reinforcement Learning

      Abstract

      The goal of inverse reinforcement learning is to find a reward function for a Markov decision process, given example traces from its optimal policy. Current IRL techniques generally rely on user-supplied features that form a concise basis for the reward. We present an algorithm that instead constructs reward features from a large collection of component features, by building logical conjunctions of those component features that are relevant to the example policy. Given example traces, the algorithm returns a reward function as well as the constructed features. The reward function can be used to recover a full, deterministic, stationary policy, and the features can be used to transplant the reward function into any novel environment on which the component features are well defined.
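The abstract describes building logical conjunctions of component features that are relevant to the example policy. The paper's actual construction is not given here, so the following is only a toy sketch of the general idea: enumerate candidate conjunctions of binary component features and score each by how strongly it distinguishes states visited by the expert traces from the rest. All names (`conjunction_features`, the "safe"/"goal_near" features, the relevance score) are hypothetical illustrations, not the paper's algorithm.

```python
from itertools import combinations

def conjunction_features(component_feats, expert_states, max_order=2, top_k=3):
    """Toy heuristic (not the paper's method): rank conjunctions of
    binary component features by how well they separate expert-visited
    states from the remaining states.

    component_feats: dict mapping feature name -> {state: 0 or 1}
    expert_states:   set of states visited by the example traces
    """
    all_states = set()
    for vals in component_feats.values():
        all_states |= vals.keys()
    non_expert = all_states - expert_states

    candidates = []
    names = sorted(component_feats)
    for r in range(1, max_order + 1):
        for combo in combinations(names, r):
            # The conjunction fires in a state iff every component fires there.
            fires = {s for s in all_states
                     if all(component_feats[n].get(s, 0) for n in combo)}
            if not fires:
                continue
            # Relevance score: gap between firing rate on expert states
            # and firing rate on non-expert states.
            p_e = len(fires & expert_states) / max(len(expert_states), 1)
            p_n = len(fires & non_expert) / max(len(non_expert), 1)
            candidates.append((abs(p_e - p_n), combo))

    candidates.sort(reverse=True)
    return [combo for _, combo in candidates[:top_k]]

# Demo on a hypothetical 6-state world: the expert stays in states
# that are both "safe" and near the goal, so the conjunction of the
# two component features scores highest.
feats = {
    "safe":      {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0},
    "goal_near": {0: 0, 1: 1, 2: 1, 3: 1, 4: 0, 5: 0},
}
top = conjunction_features(feats, expert_states={1, 2})
```

In the demo, the conjunction ("goal_near", "safe") fires exactly on the expert's states and therefore outranks either component alone, which is the intuition behind composing conjunctions rather than using raw component features directly.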

      Materials



