[Jairsubscribers] 15 new articles published by JAIR

jair-ed@ISI.EDU jair-ed at ISI.EDU
Mon May 25 21:14:18 PDT 2009


Dear JAIR subscriber:
 This message lists papers that have been recently published in JAIR and describes how to access them. (If you wish to remove yourself from this mailing list, see instructions at the end of this message.)

----------------------------------------------------------------
I. New JAIR Articles

S.  A. Wallace (2009)
  "Behavior Bounding: An Efficient Method for High-Level Behavior Comparison",
   Volume 34, pages 165-208

   For quick access go to <http://www.jair.org/papers/paper2646.html>

Abstract:
In this paper, we explore methods for comparing agent behavior with human behavior to assist with validation. Our exploration begins by considering a simple method of behavior comparison. Motivated by shortcomings in this initial approach, we introduce behavior bounding, an automated model-based approach for comparing behavior that is inspired, in part, by Mitchell’s Version Spaces. We show that behavior bounding can be used to compactly represent both human and agent behavior. We argue that relatively low amounts of human effort are required to build, maintain, and use the data structures that underlie behavior bounding, and we provide a theoretical basis for these arguments using notions of PAC Learnability. Next, we show empirical results indicating that this approach is effective at identifying differences in certain types of behaviors and that it performs well when compared against our initial benchmark methods. Finally, we demonstrate that behavior bounding can produce information that allows developers to identify and fix problems in an agent’s behavior much more efficiently than standard debugging techniques.


R.  Jurca and B.  Faltings (2009)
  "Mechanisms for Making Crowds Truthful",
   Volume 34, pages 209-253

   For quick access go to <http://www.jair.org/papers/paper2621.html>

Abstract:
We consider schemes for obtaining truthful reports on a common but hidden signal from large groups of rational, self-interested agents. One example are online feedback mechanisms, where users provide observations about the quality of a product or service so that other users can have an accurate idea of what quality they can expect. However, (i) providing such feedback is costly, and (ii) there are many motivations for providing incorrect feedback.

Both problems can be addressed by reward schemes which (i) cover the cost of obtaining and reporting feedback, and (ii) maximize the expected reward of a rational agent who reports truthfully. We address the design of such incentive-compatible rewards for feedback generated in environments with pure adverse selection. Here, the correlation between the true knowledge of an agent and her beliefs regarding the likelihoods of reports of other agents can be exploited to make honest reporting a Nash equilibrium.

In this paper we extend existing methods for designing incentive-compatible rewards by also considering collusion. We analyze different scenarios, where, for example, some or all of the agents collude. For each scenario we investigate whether a collusion-resistant, incentive-compatible reward scheme exists, and use automated mechanism design to specify an algorithm for deriving an efficient reward mechanism.


P.  Doshi and P.  J Gmytrasiewicz (2009)
  "Monte Carlo Sampling Methods for Approximating Interactive POMDPs",
   Volume 34, pages 297-337

   For quick access go to <http://www.jair.org/papers/paper2630.html>

Abstract:
Partially observable Markov decision processes (POMDPs) provide a principled framework for sequential planning in uncertain single agent settings. An extension of POMDPs to multiagent settings, called interactive POMDPs (I-POMDPs), replaces POMDP belief spaces with interactive hierarchical belief systems which represent an agent’s belief about the physical world, about beliefs of other agents, and about their beliefs about others’ beliefs. This modification makes the difficulties of obtaining solutions due to complexity of the belief and policy spaces even more acute. We describe a general method for obtaining approximate solutions of I-POMDPs based on particle filtering (PF). We introduce the interactive PF, which descends the levels of the interactive belief hierarchies and samples and propagates beliefs at each level. The interactive PF is able to mitigate the belief space complexity, but it does not address the policy space complexity. To mitigate the policy space complexity – sometimes also called the curse of history – we utilize a complementary method based on sampling likely observations while building the look ahead reachability tree. While this approach does not completely address the curse of history, it beats back the curse’s impact substantially. We provide experimental results and chart future work.


A.  Yates and O.  Etzioni (2009)
  "Unsupervised Methods for Determining Object and Relation Synonyms on the Web",
   Volume 34, pages 255-296

   For quick access go to <http://www.jair.org/papers/paper2772.html>

Abstract:
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully-implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K. The system, called Resolver , introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. On a set of two million assertions extracted from the Web, Resolver resolves objects with 78% precision and 68% recall, and resolves relations with 90% precision and 35% recall. Several variations of resolver's probabilistic model are explored, and experiments demonstrate that under appropriate conditions these variations can improve F1 by 5%. An extension to the basic Resolver system allows it to handle polysemous names with 97% precision and 95% recall on a data set from the TREC corpus.


Y.  Li, P.  Musilek, M.  Reformat and L.  Wyard-Scott (2009)
  "Identification of Pleonastic It Using the Web",
   Volume 34, pages 339-389

   For quick access go to <http://www.jair.org/papers/paper2622.html>

Abstract:
In a significant minority of cases, certain pronouns, especially the pronoun it, can be used without referring to any specific entity. This phenomenon of pleonastic pronoun usage poses serious problems for systems aiming at even a shallow understanding of natural language texts. In this paper, a novel approach is proposed to identify such uses of it: the extrapositional cases are identified using a series of queries against the web, and the cleft cases are identified using a simple set of syntactic rules. The system is evaluated with four sets of news articles containing 679 extrapositional cases as well as 78 cleft constructs. The identification results are comparable to those obtained by human efforts.


F.  Bacchus, S.  Dalmao and T.  Pitassi (2009)
  "Solving #SAT and Bayesian Inference with Backtracking Search",
   Volume 34, pages 391-442

   For quick access go to <http://www.jair.org/papers/paper2648.html>

Abstract:
Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search when augmented with a simple memoization scheme (caching) can solve any sum-of-products problem with time complexity that is at least as good any other state-of-the-art exact algorithm, and that it can also achieve the best known time-space tradeoff. Furthermore, backtracking’s ability to utilize more flexible variable orderings allows us to prove that it can achieve an exponential speedup over other standard algorithms for SUMPROD on some instances.

The ideas presented here have been utilized in a number of solvers that have been applied to various types of sum-of-product problems. These system’s have exploited the fact that backtracking can naturally exploit more of the problem’s structure to achieve improved performance on a range of probleminstances. Empirical evidence of this performance gain has appeared in published works describing these solvers, and we provide references to these works.


E.  Gabrilovich and S.  Markovitch (2009)
  "Wikipedia-based Semantic Interpretation for Natural Language Processing",
   Volume 34, pages 443-498

   For quick access go to <http://www.jair.org/papers/paper2669.html>

Abstract:
Adequate representation of natural language semantics requires
access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of concepts derived from Wikipedia, the largest encyclopedia in existence. We explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. Using ESA results in significant improvements over the previous state of the art in both tasks. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.


V.  Ruiz de Angulo and C.  Torras (2009)
  "Exploiting Single-Cycle Symmetries in Continuous Constraint Problems",
   Volume 34, pages 499-520

   For quick access go to <http://www.jair.org/papers/paper2711.html>

Abstract:
Symmetries in discrete constraint satisfaction problems have been explored and exploited in the last years, but symmetries in continuous constraint problems have not received the same attention. Here we focus on permutations of the variables consisting of one single cycle. We propose a procedure that takes advantage of these symmetries by interacting with a continuous constraint solver without interfering with it. A key concept in this procedure are the classes of symmetric boxes formed by bisecting a n-dimensional cube at the same point in all dimensions at the same time. We analyze these classes and quantify them as a function of the cube dimensionality. Moreover, we propose a simple algorithm to generate the representatives of all these classes for any number of variables at very high rates. A problem example from the chemical field and the cyclic n-roots problem are used to show the performance of the approach in practice.


T.  Rahwan, S.  D. Ramchurn, N.  R. Jennings and A.  Giovannucci (2009)
  "An Anytime Algorithm for Optimal Coalition Structure Generation",
   Volume 34, pages 521-567

   For quick access go to <http://www.jair.org/papers/paper2695.html>

Abstract:
Coalition formation is a fundamental type of interaction that involves the creation of coherent groupings of distinct, autonomous, agents in order to efficiently achieve their individual or collective goals. Forming effective coalitions is a major research challenge in the field of  multi-agent systems. Central to this endeavour is the problem of determining which of the many possible coalitions to form in order to achieve some goal. This usually requires calculating a value for every possible coalition, known as the coalition value, which indicates how beneficial that coalition would be if it was formed. Once these values  are calculated, the agents usually need to find a combination of  coalitions, in which every agent belongs to exactly one coalition, and by which the overall outcome of the system is maximized. However, this coalition structure generation problem is extremely challenging due to the number of possible solutions that need to be examined, which grows exponentially with the number of agents involved. To date, therefore, many algorithms have been proposed to solve this problem using different techniques ranging from dynamic programming, to integer programming, to stochastic search all of which suffer from major limitations relating to execution time, solution quality, and memory requirements.

With this in mind, we develop an anytime algorithm to solve the coalition structure generation problem. Specifically, the algorithm uses a novel representation of the search space, which partitions the space of possible solutions into sub-spaces such that it is possible to compute upper and lower bounds on the values of the best coalition structures in them. These bounds are then used to identify the sub-spaces that have no potential of containing the optimal solution so that they can be pruned. The algorithm, then, searches through the remaining sub-spaces very efficiently using a branch-and-bound technique to avoid examining all the solutions within the searched subspace(s). In this setting, we prove that our algorithm enumerates all coalition structures efficiently by avoiding redundant and invalid solutions automatically. Moreover, in order to effectively test our algorithm we develop a new type of input distribution which allows us to generate more reliable benchmarks compared to the input distributions previously used in the field. Given this new distribution, we show that for 27 agents our algorithm is able to find solutions that are optimal in 0.175% of the time required by the fastest available algorithm in the literature. The algorithm is anytime, and if interrupted before it would have normally terminated, it can still provide a solution that is guaranteed to be within a bound from the optimal one. Moreover, the guarantees we provide on the quality of the solution are significantly better than those provided by the previous state of the art algorithms designed for this purpose. For example, for the worst case distribution given 25 agents, our algorithm is able to find a 90% efficient solution in around 10% of time it takes to find the optimal solution.


S.  R. K. Branavan, H.  Chen, J.  Eisenstein and R.  Barzilay (2009)
  "Learning Document-Level Semantic Properties from Free-Text Annotations",
   Volume 34, pages 569-603

   For quick access go to <http://www.jair.org/papers/paper2633.html>

Abstract:
This paper presents a new method for inferring the semantic properties of documents by leveraging free-text keyphrase annotations.  Such annotations are becoming increasingly abundant due to the recent dramatic growth in semi-structured, user-generated online content. One especially relevant domain is product reviews, which are often annotated by their authors with pros/cons keyphrases such as ``a real bargain'' or ``good value.'' These annotations are representative of the underlying semantic properties; however, unlike expert annotations, they are noisy: lay authors may use different labels to denote the same property, and some labels may be missing.  To learn using such noisy annotations, we find a hidden paraphrase structure which clusters the keyphrases.  The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews.  Our approach is implemented as a hierarchical Bayesian model with joint inference.  We find that joint inference increases the robustness of the keyphrase clustering and encourages the latent topics to correlate with semantically meaningful properties.  Multiple evaluations demonstrate that our model substantially outperforms alternative approaches for summarizing single and multiple documents into a set of semantically salient keyphrases.


F.  Sanchez-Martinez and M.  L. Forcada (2009)
  "Inferring Shallow-Transfer Machine Translation Rules from Small Parallel Corpora",
   Volume 34, pages 605-635

   For quick access go to <http://www.jair.org/papers/paper2735.html>

Abstract:
This paper describes a method for the automatic inference of structural transfer rules to be used in a shallow-transfer machine translation (MT) system from small parallel corpora. The structural transfer rules are based on alignment templates, like those used in statistical MT. Alignment templates are extracted from sentence-aligned parallel corpora and extended with a set of restrictions which are derived from the bilingual dictionary of the MT system and control their application as transfer rules. The experiments conducted using three different language pairs in the free/open-source MT platform Apertium show that translation quality is improved as compared to word-for-word translation (when no transfer rules are used), and that the resulting translation quality is close to that obtained using hand-coded transfer rules. The method we present is entirely unsupervised and benefits from information in the rest of modules of the MT system in which the inferred rules are applied.



T.  A. Cohn and M.  Lapata (2009)
  "Sentence Compression as Tree Transduction",
   Volume 34, pages 637-674

   For quick access go to <http://www.jair.org/papers/paper2655.html>

Abstract:
This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture structural mismatches. We describe an algorithm for decoding in this framework and show how the model can be trained discriminatively within a large margin framework.  Experimental results on sentence compression bring significant improvements over a state-of-the-art model.


O.  Gimenez and A.  Jonsson (2009)
  "Planning over Chain Causal Graphs for Variables with Domains of Size 5 Is NP-Hard",
   Volume 34, pages 675-706

   For quick access go to <http://www.jair.org/papers/paper2742.html>

Abstract:
Recently, considerable focus has been given to the problem of determining the boundary between tractable and intractable planning problems. In this paper, we study the complexity of planning in the class C_n of planning problems, characterized by unary operators and directed path causal graphs. Although this is one of the simplest forms of causal graphs a planning problem can have, we show that planning is intractable for C_n (unless P = NP), even if the domains of state variables have bounded size. In particular, we show that plan existence for C_n^k is NP-hard for k>=5 by reduction from CNFSAT. Here, k denotes the upper bound on the size of the state variable domains. Our result reduces the complexity gap for the class C_n^k to cases k=3 and k=4 only, since C_n^2 is known to be tractable.


A.  Singh, A.  Krause, C.  Guestrin and W.  J. Kaiser (2009)
  "Efficient Informative Sensing using Multiple Robots",
   Volume 34, pages 707-755

   For quick access go to <http://www.jair.org/papers/paper2674.html>

Abstract:
The need for efficient monitoring of spatio-temporal dynamics in large environmental applications, such as the water quality monitoring in rivers and lakes, motivates the use of robotic sensors in order to achieve sufficient spatial coverage. Typically, these robots have bounded resources, such as limited battery or limited amounts of time to obtain measurements. Thus, careful coordination of their paths is required in order to maximize the amount of information collected, while respecting the resource constraints. In this paper, we present an efficient approach for near-optimally solving the NP-hard optimization problem of planning such informative paths. In particular, we first develop eSIP (efficient Single-robot Informative Path planning), an approximation algorithm for optimizing the path of a single robot. Hereby, we use a Gaussian Process to model the underlying phenomenon, and use the mutual information between the visited locations and remainder of the space to quantify the amount of information collected. We prove that the mutual information collected using paths obtained by using eSIP is close to the information obtained by an optimal solution. We then provide a general technique, sequential allocation, which can be used to extend any single robot planning algorithm, such as eSIP, for the multi-robot problem. This procedure approximately generalizes any guarantees for the single-robot problem to the multi-robot case. We extensively evaluate the effectiveness of our approach on several experiments performed in-field for two important environmental sensing applications, lake and river monitoring, and simulation experiments performed using several real world sensor network data sets.


M.  Zaffalon and E.  Miranda (2009)
  "Conservative Inference Rule for Uncertain Reasoning under Incompleteness",
   Volume 34, pages 757-821

   For quick access go to <http://www.jair.org/papers/paper2736.html>

Abstract:
In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the incompleteness process. We allow the process' behaviour to be partly unknown. Then we use Walley's theory of coherent lower previsions, a generalisation of the Bayesian theory to imprecision, to derive the rule to update beliefs under incompleteness that logically follows from our assumptions, and that we call conservative inference rule. This rule has some remarkable properties: it is an abstract rule to update beliefs that can be applied in any situation or domain; it gives us the opportunity to be neither too optimistic nor too pessimistic about the incompleteness process, which is a necessary condition to draw reliable while strong enough conclusions; and it is a coherent rule, in the sense that it cannot lead to inconsistencies. We give examples to show how the new rule can be applied in expert systems, in parametric statistical inference, and in pattern classification, and discuss more generally the view of incompleteness processes defended here as well as some of its consequences.




----------------------------------------------------------------
II. Unsubscribing from our Mailing List

To remove yourself from the JAIR subscribers mailing list, visit our
Web site (http://www.jair.org/), follow the link "notify me of new
articles", enter your email address in the form at the bottom of the
page, and follow the directions.  In the event that you've already
deleted yourself from the list and we keep sending you messages like
this one, send mail to jair-ed at isi.edu.

----------------------------------------------------------------



More information about the Jairsubscribers mailing list