# Probabilistic Consistency Engine (PCE)

The Probabilistic Consistency Engine (PCE) combines possibly conflicting evidence from a collection of data sources into a most probable hypothesis consistent with the evidence. PCE takes as input a collection of facts and weighted rules and generates the marginal probabilities of individual atoms and formulas, using mechanisms based on Markov Logic Networks.

## Overview

The Probabilistic Consistency Engine (PCE) combines possibly conflicting evidence from a collection of data sources into a most probable hypothesis consistent with the evidence. In essence, PCE adjudicates over differing evidence to generate possible explanations with associated probabilities. PCE takes as input a collection of facts and weighted rules and generates the marginal probabilities of individual atoms and formulas, using mechanisms based on Markov Logic Networks. In particular, PCE uses MCSat and Lazy MCSat, which are variants of Markov Chain Monte Carlo (MCMC) sampling methods.
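To make the sampling idea concrete, here is a minimal Python sketch of the general MC-SAT scheme on a toy model. It is not PCE's implementation: the atoms `a` and `b`, the two clauses, and their weights are hypothetical, and the SAT-based sampler normally used for the uniform-sampling step is replaced by brute-force enumeration, which is only feasible for a handful of atoms.

```python
import itertools
import math
import random

# Hypothetical toy model (not taken from PCE): two Boolean atoms and
# two weighted clauses. Each clause is (weight, test over a world).
ATOMS = ["a", "b"]
CLAUSES = [
    (1.5, lambda w: w["a"]),                  # a         (weight 1.5)
    (1.0, lambda w: (not w["a"]) or w["b"]),  # a => b    (weight 1.0)
]

# Every truth assignment to the atoms.
WORLDS = [
    dict(zip(ATOMS, bits))
    for bits in itertools.product([False, True], repeat=len(ATOMS))
]


def exact_marginal(atom):
    """Marginal of `atom` by summing exp(total satisfied weight) over worlds."""
    weights = [math.exp(sum(wt for wt, c in CLAUSES if c(w))) for w in WORLDS]
    z = sum(weights)
    return sum(x for x, w in zip(weights, WORLDS) if w[atom]) / z


def mcsat_marginal(atom, num_samples=20000, seed=0):
    """Estimate the marginal of `atom` with an MC-SAT style sampler."""
    rng = random.Random(seed)
    world = rng.choice(WORLDS)
    hits = 0
    for _ in range(num_samples):
        # Keep each clause satisfied by the current world with
        # probability 1 - exp(-weight).
        kept = [
            c for wt, c in CLAUSES
            if c(world) and rng.random() < 1.0 - math.exp(-wt)
        ]
        # Move to a world drawn uniformly from those satisfying every kept
        # clause. The current world always qualifies, so this is non-empty.
        candidates = [w for w in WORLDS if all(c(w) for c in kept)]
        world = rng.choice(candidates)
        hits += world[atom]
    return hits / num_samples


estimate = mcsat_marginal("a")
exact = exact_marginal("a")
print(f"MC-SAT estimate: {estimate:.3f}, exact: {exact:.3f}")
```

On a model this small the exact marginal is cheap to compute, which makes it a convenient check on the sampler; sampling only pays off when the number of ground atoms makes enumeration impractical.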

PCE was developed for PAL’s CALO Research System. That system had a variety of learners that dealt with entities such as people, emails, projects, meetings, and folders. These learners created rules and assertions that, for example, independently predicted which project an email was related to, or who should attend a meeting. When new information arrived, the learners provided their predicted assertions, and PCE computed marginal probabilities for the new (and old) assertions. Predictions were presented to the user; if the user changed the assignment to another project, that correction was fed back to the learners, allowing them to adjust the rules and assertions they supplied to PCE.

For example, after a new email arrived, two learning algorithms might predict:

• Based on the subject matter, an 80% probability that the email belongs to project P1; 10% probability that it belongs to project P2; 10% probability that it belongs elsewhere.
• Based on the sender and recipients, a 60% probability that the email belongs to project P2; 30% probability that the email belongs to project P1; 10% probability that it belongs elsewhere.

PCE would compute marginal probabilities and assert that the email belongs to (say) project P1 with 60% probability, and to project P2 with 30% probability.
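As a rough illustration of how two such predictions can be reconciled, the sketch below pools the learners' distributions log-linearly: log-probabilities are added per outcome and the result is renormalized. This is an assumed stand-in chosen for simplicity, not PCE's actual computation, so it does not reproduce the illustrative 60% and 30% figures above; the variable and function names are hypothetical.

```python
import math

# The two learners' predictions from the example above.
subject_learner = {"P1": 0.8, "P2": 0.1, "elsewhere": 0.1}
sender_learner = {"P1": 0.3, "P2": 0.6, "elsewhere": 0.1}


def combine(*predictions):
    """Log-linear pooling: add log-probabilities per outcome, then renormalize."""
    outcomes = predictions[0].keys()
    scores = {o: sum(math.log(p[o]) for p in predictions) for o in outcomes}
    z = sum(math.exp(s) for s in scores.values())
    return {o: math.exp(s) / z for o, s in scores.items()}


combined = combine(subject_learner, sender_learner)
for outcome, prob in combined.items():
    print(f"{outcome}: {prob:.3f}")
```

With equal trust in both learners, this pooling gives P1 roughly 77%, P2 roughly 19%, and elsewhere roughly 3%. In a weighted-rule setting the relative influence of each learner would instead be governed by the weights attached to its rules.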

The input to PCE consists of a sort hierarchy, constants, direct and indirect predicates, atomic facts asserting a direct predicate of specific constants, and weighted rules: universally quantified formulas over both the direct and indirect predicates, each with an associated weight. A direct predicate is one that is observable (e.g., that an email was received from a specific email address, with specific content). An indirect predicate is one that can only be inferred (e.g., that the email is associated with a particular project). PCE is then queried with formula patterns; it runs new samples and outputs the matching formula instances along with their marginal probabilities.
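The pieces of this input can be sketched as plain Python data, with marginals computed by exhaustive enumeration. Everything here is hypothetical and for illustration only: the sort, constant, and predicate names, the single rule and its weight of 2.0, and the assumption that unasserted direct atoms are false. It does not use PCE's input syntax or its sampling-based inference.

```python
import itertools
import math

# Sorts and their constants (hypothetical names).
CONSTANTS = {"Email": ["e1", "e2"], "Project": ["p1"]}

# Predicates: name -> (argument sorts, kind).
PREDICATES = {
    "mentions": (("Email", "Project"), "direct"),   # observable
    "about": (("Email", "Project"), "indirect"),    # must be inferred
}

# Atomic facts on direct predicates. Assumption: any direct atom
# not listed here is treated as false.
FACTS = {("mentions", "e1", "p1")}

# One weighted rule: for all e, p: mentions(e, p) => about(e, p).
RULE_WEIGHT = 2.0


def ground_atoms(kind):
    """All ground atoms of predicates with the given kind."""
    atoms = []
    for name, (sorts, k) in PREDICATES.items():
        if k == kind:
            for args in itertools.product(*(CONSTANTS[s] for s in sorts)):
                atoms.append((name, *args))
    return atoms


def rule_groundings():
    """Yield (body_atom, head_atom) for every instantiation of the rule."""
    for e, p in itertools.product(CONSTANTS["Email"], CONSTANTS["Project"]):
        yield ("mentions", e, p), ("about", e, p)


def marginals():
    """Marginal probability of each indirect atom, by enumerating all worlds."""
    hidden = ground_atoms("indirect")
    totals = dict.fromkeys(hidden, 0.0)
    z = 0.0
    for bits in itertools.product([False, True], repeat=len(hidden)):
        world = dict(zip(hidden, bits))
        # A grounding is satisfied if its body is false or its head is true.
        satisfied = sum(
            1 for body, head in rule_groundings()
            if body not in FACTS or world[head]
        )
        weight = math.exp(RULE_WEIGHT * satisfied)
        z += weight
        for atom, value in world.items():
            if value:
                totals[atom] += weight
    return {atom: t / z for atom, t in totals.items()}


result = marginals()
for atom, prob in result.items():
    print(atom, round(prob, 3))
```

In this toy model the observed `mentions` fact pushes `about(e1, p1)` well above one half, while `about(e2, p1)` stays at exactly one half because no evidence or rule grounding bears on it.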

Potential applications of PCE include prediction, Bayesian analysis, cooperative learning, language analysis, probabilistic bounded model checking, timing and failure analysis, and probabilistic optimization. PCE is being used in the DARPA Machine Reading program as a harness to mediate outputs from a range of learners.

## Limitations

• PCE lacks a facility for explaining its conclusions.

Overview: DISTAR 14982 – Approved for Public Release, Distribution Unlimited
API and Example: DISTAR 15262 – Approved for Public Release, Distribution Unlimited
Source and Object Code: DISTAR 14502 – Approved for Public Release, Distribution Unlimited