% ipmudpgh(96)
%
%
\documentstyle[ipmu]{article}
\def\N{\mbox{I\hspace{-0.11em}N}}
%
\title{Application of a Symbolic Probability Theory to
Qualitative Reasoning under Uncertain Information}
%
\author{ {\bf Daniel PACHOLCZYK} \\
LERIA - Facult\'{e} des Sciences\\
Universit\'{e} d'Angers,\\
2 Boulevard Lavoisier\\
49045 ANGERS CEDEX\\
T\'{e}l. 19 33 41 73 54 68\\
Fax. 19 33 41 73 54 54\\
pacho@univ-angers.fr \\
\And
{\bf Gilles HUNAULT} \\
LERIA - Facult\'{e} des Sciences\\
Universit\'{e} d'Angers,\\ 2 Boulevard Lavoisier\\
49045 ANGERS CEDEX\\
T\'{e}l. 19 33 41 73 54 68\\
Fax. 19 33 41 73 54 54\\
gilles.hunault@univ-angers.fr \\
\And
{\bf Jean-Marc PACHOLCZYK}\\
LAFORIA - Equipe Th\'{e}orie des\\ repr\'{e}sentations cognitives\\
Institut B. Pascal, CNRS (URA 1095),\\
Universit\'{e} Curie, Paris VI,\\
4 place Jussieu,\\ 75252 PARIS CEDEX 05
}
%
\begin{document}
\maketitle
\renewcommand{\thesection}{\arabic{section}.}
\renewcommand{\thesubsection}{\arabic{section}.\arabic{subsection}}
\font\ghm=cmr6
\font\ghp=cmr7
%
\begin{abstract}
This paper deals with the Representation and the
Symbolic Management of Uncertain information.
After a presentation of some examples, we give the outlines of
our Symbolic Probability Theory,
ending with a discussion of the size of the model and its
influence on the results.
The results given here show that such a
Symbolic Probability
Theory constitutes a satisfying model for the
management of Qualitative Reasoning under uncertain
information encoded in a
qualitative way.
\end{abstract}
%
\section{INTRODUCTION }% 1
%
We present here some examples of Qualitative
Reasoning using a
Symbolic Probability Theory. The approach developed here is
Qualitative; thus, no Numerical computing will be found. During
the last ten years, a new concept, viewed as a Symbolic extension
of the classical concept of Probability, has been studied in order
to represent symbolically the Uncertainty in Bayesian
Networks, or the Uncertainty and Vagueness of statements
\cite{pacd92b,pacd93},
both expressed in a
qualitative way.
A first scale of degrees of Truth makes it possible to express
the graduation of the Vagueness.
A second scale of
Uncertainty degrees, distinct from the first, is used to express
the graduation of the Uncertainty.
All the
definitions and all the results presented here will be
translated with the aid
of a certainty function of statements, called
\texttt{Cert}. Thus, the statement:
"it is \textsf{rather-probable} that Smith be a
\textit{\textsf{very}} man", whose formal translation is
${\mathcal{A}}\ {\models}_{\textsf{rather}}$
\texttt{Cert}(Smith is a \textit{\textsf{very}} rich man)
will be expressed as:
\texttt{Cert}(Smith is a \textit{\textsf{very}} rich man)
= \textsf{rather}(\texttt{Probable}).
Section 2 will provide some examples written in Natural Language
and their treatment using the formulas and rules from our
Symbolic Probability Theory. Section 3 deals with the theoretical
framework.
Section 4 will discuss the size of the model and
the different possible values in a chosen model, showing that
coherent, overlapping results can be found for any M.
\textbf{Remark:}
Let's recall that
psychologists
consider that a symbolic graduation cannot reasonably
be apprehended by humans beyond ten degrees
\cite{leny87,maov90,rbgh90}.
%
\section{THE EXAMPLES}% 2
%
In many situations, the cognitive agent knows that a statement is either
\textit{true} or \textit{false} but he is unable to decide which one
is correct. He can, at most, \textit{grade symbolically his Certainty}, ranging from
\textsf{Impossible} to \textsf{Certain}. The graduation $v_\alpha$ of his certainty will be
\textit{expressed linguistically} by statements
such as "A is $v_\alpha$ probable" or "the Certainty of A is $v_\alpha$".
That's why we have set the study of the Uncertainty in the
framework of a many-valued first order Logic:
the cognitive agent has only a partial knowledge of the
Truth. Since he apprehends
this degree of
Certainty not in a numerical
but in a symbolic way, which is, by nature, discrete,
the evaluation of the uncertainty
related to a statement can only be an "approximation".
Results are shown with the help of
a \textit{graduation scale} of M symbolic values:
${\mathcal{B}}_{M}$=\{$v_\alpha, \alpha=1...M$\}.
It is totally ordered by the relation
$v_\alpha\leq{}v_\beta\Leftrightarrow\alpha\leq\beta$.
For example, with M=7, we can propose as
linguistic translation of
these symbolic degrees the expressions given by
the following set:
${\mathcal{B}}_{7}$=\{$v_1...v_7$\}=%
\textsf{\{impossible, very-little-probable, little-probable,
probable,
rather-probable, very-probable, certain\}}.
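Such a finite, totally ordered scale is straightforward to encode. The following Python sketch (the names \texttt{B7}, \texttt{degree} and \texttt{leq} are ours, introduced purely for illustration) represents each degree $v_\alpha$ by its index $\alpha$, so that the order relation reduces to integer comparison:

```python
# The scale B_7 of symbolic certainty degrees, ordered from v_1 to v_7.
B7 = ["impossible", "very-little-probable", "little-probable", "probable",
      "rather-probable", "very-probable", "certain"]

def degree(label, scale=B7):
    """Index alpha (1-based) of the degree v_alpha carrying this label."""
    return scale.index(label) + 1

def leq(a, b, scale=B7):
    """The total order: v_alpha <= v_beta  iff  alpha <= beta."""
    return degree(a, scale) <= degree(b, scale)
```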
The following examples deal with real situations, that is, situations that can be found
in the real world. Even if it is impossible to give a numerical probability for the
considered events, it is always possible to attach a certainty state or a certainty
degree to them. The examples will be treated in ${\mathcal{L}}_{7}$.
The first example (the magician)
illustrates the \textit{Principle of Total Certainty}, while the
second example (the fire outbreaks) also uses the
\textit{Symbolic Generalised Bayes' Formula},
and the third (shapes of digits and parity)
also uses the \textit{Propagation Rule}.
%
\subsection{The magician}% 2.1
%
Each night, in a cabaret, a magician performs a card
trick. He chooses a person in the room and lets him draw
a card out of a pack of 32. He then forecasts
systematically the event B: "You drew an ace". Of course, he has
some accomplices in the room, who are \textsf{certain} to draw an
ace. The trick would be perfect if, every night,
anonymous persons didn't come forward as volunteers. Let's suppose
that it is \textsf{little-probable} that "the person that draws the
card is an accomplice" and \textsf{rather-probable} that "the person that draws the
card is not an accomplice". Knowing that, for a normal
person, it is \textsf{very-little-probable} that "the card is an
ace", our magician would like to know, before performing
his show, what certainty he can expect on the fact that
"an ace is drawn". Let A be the event "the person is an accomplice".
Since we have \textsf{Cert}(B$\vert$A) = \textsf{certain}
and \textsf{Cert}(B$\vert\neg$A) = \textsf{very-little-probable},
by the \textit{Total Certainty Formula}, we get \textsf{Cert(B)} = \textsf{probable},
so it is \textsf{probable} that "the person draws an
ace". \textit{The risks of contest being too big, he abandons his trick}.
%
\subsection{Outbreak of fire and prevention}% 2.2
%
Let's now illustrate the theory with an example for a DDSIS (a French
acronym for the Direction of a Department of Fire and Help Services).
The Director of this Department wants to know the certainty he may have about the risks of a
forest fire in his geographic area. He then considers the SDACR (acronym for a
Departmental Scheme of Analysis and Coverage of Risks)
classification of areas: "high risks area" (A1), "low and middle risks area" (A2) or
"extremely weak risks area" (A3). It is then \textsf{probable}
that an area is an A1, \textsf{little-probable} that it is an A2
and \textsf{very-little-probable} that it is an A3.
He then estimates that it is \textsf{very-probable}
that "there will be risks of a forest fire" (B),
knowing the area is an A1, \textsf{rather-probable} that there will be risks of forest
fire knowing the area is an A2, and that it is \textsf{little-probable}
that there will be risks of forest fire knowing the area is an A3.
With the \textit{Total Certainty Formula}, our director
knows then that it is \textsf{rather-probable} that there will be a risk of a forest fire.
The DDSIS would also like to study the need for buying a new HFPT, that is, a heavy FPT
(an acronym for Ton Pump Van, that is, hardware of big capacities for massive fight against the fire,
including helicopters, Canadairs...). He thus considers (C): "a new HFPT is needed".
Considering the fleet of heavy and light FPT, he estimates that it is
\textsf{impossible} that a HFPT is needed in an A3 or in an A2,
but that it is \textsf{rather-probable} that it is needed in an A1.
After another thought, using the \textit{Total Certainty Formula}, the DDSIS estimates that it is
beforehand also \textsf{rather-probable} that a HFPT is needed.
The DDSIS would now refine this result knowing that its area is a high risks area.
From the \textit{Bayes' Formula} its director is able to conclude that it is at least
\textsf{very-little-probable} and at
the most \textsf{probable} that a new HFPT is needed.
\textit{From these considerations, he
decides to postpone the buying to another date}.
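Again for comparison, the classical counterpart of the \textit{Symbolic Bayes' Formula} used here is
$$P(A_1\mid B) = \frac{P(B\mid A_1)\,P(A_1)}{P(B)};$$
in the symbolic setting the division cannot always be inverted exactly, which explains why the director obtains an interval of certainty degrees rather than a single degree.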
%
\subsection{Shapes of digits and parity}% 2.3
%
Let's now meet a young scholar who has read a digit and wonders about its parity.
He would like to know the Certainty he may have that the read digit is even, depending
only on its shape. He then considers three events:
"the read digit owns a line" ($D_1$=\{1,2,4,5,7\}),
"the read digit is made only of curves without being a circle" ($D_2$=\{3,6,8,9\})
and "the digit is a circle" ($D_3$=\{0\}). He then esteems that it is
\textsf{probable} that the digit owns a line,
\textsf{little-probable} that the digit is made only of curves without being
a circle, and \textsf{very-little-probable} that "the digit is a circle".
He is then interested in the fact that "the digit is even" (B=\{0,2,4,6,8\}).
Evaluating that it is \textsf{little-probable} that the digit is even given that
it owns a line, \textsf{probable} that it is even
given that it is made only of curves, and \textsf{certain}
that it is even given that it is a circle, he concludes,
using the \textit{Total Certainty Formula}
that it is \textsf{probable} that the digit is even.
He now wants to find the degree of certainty that, after having read a digit (and not a
letter or a special character), the latter owns a curve. So he considers the three
following events: (B) "the digit is even" (B=\{0,2,4,6,8\}), (C) "the digit owns a curve "
(C=\{0,2,3,5,6,8,9\}), (D)"the symbol is a digit" (D = \{0,1,2,3,4,5,6,7,8,9\}).
To our scholar, it is \textsf{certain} that the symbol is a digit,
\textsf{very-probable} that the digit owns a curve when the symbol is a
digit, and lastly that it is \textsf{rather-probable} that the digit is even
when it owns a curve. So, using the \textit{Uncertainty Propagation Rule}
our scholar knows that it is between \textsf{probable} and \textsf{rather-probable}
that the digit is even.
\textit{Confident on the distribution of even numbers among symbols,
he decides to go on learning.}
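As a purely numerical cross-check of the scholar's events (our illustration only: it assumes a uniform distribution over the ten digits, an assumption the symbolic model deliberately avoids), one can compute the corresponding classical probabilities:

```python
from fractions import Fraction

# Events from the example, as subsets of the ten digits.
D = set(range(10))             # (D) "the symbol is a digit"
B = {0, 2, 4, 6, 8}            # (B) "the digit is even"
C = {0, 2, 3, 5, 6, 8, 9}      # (C) "the digit owns a curve"

def p(event, given=D):
    """Conditional probability under a uniform distribution on digits."""
    return Fraction(len(event & given), len(given))

# P(C|D) = 7/10 and P(B|C) = 4/7, values consistent with the qualifiers
# "very-probable" and "rather-probable" on a 7-degree scale.
```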
%
\section{THE MODEL}% 3
%
Two operators are defined in ${\mathcal{B}}_{7}$:
a symbolic sum noted
$\mathcal{S}$
and a
symbolic complement
\textsf{n}, both
analogous to the usual
operators of classical Probabilities. The choice of
$\mathcal{S}$ is not unique and we take the T-conorm
associated to Lukasiewicz's implication.
${\mathcal{S}}(v_\alpha,v_\beta)$
is defined by
$v_{\alpha+\beta-1}$ if
$\alpha+\beta-1\leq M$, and by $v_M$ otherwise. Similarly, a symbolic product
$\mathcal{I}$, the T-norm associated with $\mathcal{S}$, is used for conditioning;
the solutions $x$ of ${\mathcal{I}}(v_a,x)=v_b$ form the set
[$v_1$,$v_a\to v_b$] for $v_a > v_1$ and $v_b$=$v_1$,
[$v_2$,$v_a\to v_b$] for $v_a > v_1$ and $v_b$=$v_2$,
\{$v_a \to v_b$\} if $v_a\geq v_b$, and $\emptyset$ otherwise.
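On indices, these operators admit a direct encoding. The Python sketch below is ours: it takes ${\mathcal{S}}(v_\alpha,v_\beta)=v_{\min(M,\alpha+\beta-1)}$ (the Lukasiewicz T-conorm) together with the order-reversing complement $\textsf{n}(v_\alpha)=v_{M-\alpha+1}$, a natural candidate for the complement introduced above:

```python
M = 7  # size of the scale; degrees v_alpha are represented by indices 1..M

def S(a, b):
    """Symbolic sum: Lukasiewicz T-conorm, v_a (+) v_b = v_min(M, a+b-1)."""
    return min(M, a + b - 1)

def n(a):
    """Symbolic complement: n(v_a) = v_(M-a+1), reversing the scale."""
    return M - a + 1

# As with classical probabilities, a degree and its complement add up to
# "certain": S(a, n(a)) = v_M for every a.
```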
It is important to note that there is no objective way
to decide which size is the best for a given problem,
and which operator I should be taken.
It is only a matter of choice. Taking M=7 gives more precise results than M=5,
but it also requires choosing I more carefully. Let's suppose that we have chosen
to work with 7 degrees of certainty, each degree being related to a linguistic
expression. If we try to solve a problem for which the elementary elements have
a probability of one tenth, we will have
to add some degrees. But there are many ways to do it. The first possibility is
to recreate a scale of degrees having the right number of degrees (that is 10 for
our case). But then, there will be no correspondence between the scale of the
problem and the linguistic scale. So the result will not be expressible in a
linguistic way. On the contrary, if we replace each degree of
${\mathcal{L}}_{7}$
by a fixed number
of degrees (2 for our example), then, each specific degree of the problem will be a
specialisation of a degree of
${\mathcal{L}}_{7}$.
For instance \textsf{probable} will give the two
qualifiers \textsf{rather-probable} and \textsf{very-probable}.
We will then be able to match each specific degree of the problem with a
linguistic expression. The result will then always be given precisely with
the new graduation though it may not be very expressive. But it will be expressed
in natural language. The problem of the graduality has already been studied
in \cite{pacd92a} but this is not the method used here. Let's see how
${\mathcal{L}}_{5}$,
${\mathcal{L}}_{7}$,
${\mathcal{L}}_{9}$,
${\mathcal{L}}_{11}$
fit to represent our examples. We may choose any linguistic qualifiers though
the derivation of qualifiers from one M to another may not be explicit.
The correspondence between a scale and another one
is not obvious. For instance, we could use a hierarchic scale of degrees by
refining the leftmost term, but we prefer the last one of Figure 1 which
seems more reliable for our examples. The comparison can be done as follows:
for each scale (that is for each choice of M), we compute the certainty of the
same event. The results are then given in the following Table. One has to be
careful: $[v_3, v_4]$ for M=7 and $[v_3, v_4]$
for M=9 do not mean the same thing. Also, some choices have been made:
$v_6$ for M=7 may be taken as $v_7$ or $v_8$ for M=9. But the reader can check,
as we did, that even with the other possible choices, the results still hold.
\begin{center}Table 2: Comparison of results for different M\end{center}
{\small
% deuxi\`{e}me tableau
\begin{tabular}{|c|c|c|c|}
\hline
Problem & magician (1) & magician (2)& fire \\ \hline
Rule & \textit{Total} & \textit{Bayes'} & \textit{Uncertainty}\\
& \textit{Certainty} & \textit{Formula} & \textit{Propagation} \\ \hline
M = 5 & $v_3$ & [$v_2$,$v_4$] & [$v_3$,$v_4$] \\ \hline
M = 7 & $v_4$ & \{ $v_6$ \} & [$v_4$,$v_5$] \\ \hline
M = 9 & $v_5$ & \{ $v_8$ \} & [$v_4$,$v_6$] \\ \hline
M = 11 & $v_6$ & \{ $v_{10}$ \} & [$v_4$,$v_7$] \\ \hline
\end{tabular}
}% end small
With all these examples, the results appear coherent: the intervals overlap or are
contained in one another, but we never get disjoint intervals or isolated values. It has
to be noted, however, that for the same value of M, the results are more or less satisfying for
different problems. Although there seems to be a better value of M, that is, a value
giving a smaller interval and thus a stricter approximation, it is not
possible to fix and use a unique value of M for all problems, nor to increase the
number of degrees to obtain more precise intervals in all cases. This is shown in
Figure 1, where the rectangles indicate the interval of solutions. The scales with
M=5 or M=7 are probably too small, that is, they have too few values but, globally,
the most adequate size seems to be M=9. In fact, a cognitive agent, for a
predicative domain, tends not to use the qualifiers below the mean value.
Being positivist, he will choose the event that is the most probable. This means
that he may have to express his
certainty not on an event but on its complement. It follows that the expressive
power of a scale is related not to the whole set of terms but to the number of
terms above the mean value (which is, by itself, non-informative).
\newpage
\begin{center}
%\fbox{
%{\tiny
\begin{picture}(460,200)
\put(150,210){Figure 1: Scales of qualifiers and interval of solutions}
\put(190,195){(fire outbreaks example)}
{\font\ghm=cmr6 \ghm
\put(0,166){M=3}
\put(0,160){impossible}
\put(0,126){M=5}
\put(0,120){impossible}
\put(0,086){M=7}
\put(0,080){impossible}
\put(0,046){M=9}
\put(0,040){impossible}
\put(0,006){M=11}
\put(0,000){impossible}
\put(10,156){\line(0,-1){23}}
\put(10,116){\line(0,-1){23}}
\put(10,076){\line(0,-1){23}}
\put(10,036){\line(0,-1){23}}
%
\put(220,160){probable}% 3
\put(220,120){probable}
\put(220,080){probable}% 7
\put(220,040){probable}
\put(220,000){probable}% 11
\put(230,156){\line(0,-1){30}}
\put(230,116){\line(0,-1){30}}
\put(230,076){\line(0,-1){30}}
\put(230,036){\line(0,-1){30}}
%
\put(430,160){certain}% 3
\put(430,120){certain}
\put(430,080){certain}% 7
\put(430,040){certain}
\put(430,000){certain}% 11
\put(440,156){\line(0,-1){30}}
\put(440,116){\line(0,-1){30}}
\put(440,076){\line(0,-1){30}}
\put(440,036){\line(0,-1){30}}
%
\put(090,120){less-than-probable}% 5
\put(315,120){more-than-probable}
\put(060,080){very-little-probable}% 7
\put(145,080){little-probable}
\put(280,080){rather-probable}
\put(360,080){very-probable}
\put(040,040){frankly-improbable}% 9
\put(106,040){really-improbable}
\put(165,040){little-probable}
\put(255,040){rather-probable}
\put(315,040){really-probable}
\put(375,040){frankly-probable}
\put(106,000){really-improbable}% 11
\put(165,000){little-improbable}
\put(255,000){rather-probable}
\put(315,000){really-probable}
\put(005,-10){frankly-very-improbable}
\put(085,-10){really-very-improbable}
\put(325,-10){really-very-probable}
\put(395,-10){frankly-very-probable}
%
\bezier{200}(130,130)(215,155)(215,155)% 3 -> 5
\bezier{200}(330,134)(240,155)(240,155)
\bezier{200}(123,115)(086,090)(086,090)% 5 -> 7
\bezier{200}(123,115)(160,090)(160,090)
\bezier{200}(335,115)(300,094)(300,094)
\bezier{200}(335,115)(370,090)(370,090)
\bezier{200}(095,074)(075,054)(075,054)% 7 -> 9
\bezier{200}(095,074)(115,054)(115,054)
\bezier{200}(170,074)(183,054)(183,054)
\bezier{200}(290,074)(272,054)(272,054)
\bezier{200}(370,074)(350,054)(350,054)
\bezier{200}(370,074)(390,050)(390,050)
\put(129,036){\line(0,-1){30}}%%%%%%%%% 9 -> 11
\put(190,036){\line(0,-1){30}}
\put(270,036){\line(0,-1){30}}
\put(340,036){\line(0,-1){30}}
\bezier{200}(066,036)(046,-02)(046,-02)
\bezier{200}(066,036)(086,-02)(086,-02)
\bezier{200}(400,036)(380,-02)(380,-02)
\bezier{200}(400,036)(420,-02)(420,-02)
% les cadres
\put(214,110){\line(01,00){175}}
\put(389,110){\line(00,01){020}}
\put(389,130){\line(-1,00){175}}
\put(214,130){\line(00,-1){020}}
%
\put(214,070){\line(01,00){124}}
\put(338,070){\line(00,01){020}}
\put(338,090){\line(-1,00){124}}
\put(214,090){\line(00,-1){020}}
%
\put(214,030){\line(01,00){155}}
\put(369,030){\line(00,01){020}}
\put(369,050){\line(-1,00){155}}
\put(214,050){\line(00,-1){020}}
}% ghm
\end{picture}
%}% tiny
%}% fbox
\end{center}
\begin{tabular}{c}
\\[-2cm]
\end{tabular}
\noindent
\begin{tabular}{c}
\\[0.1cm]
\end{tabular}
So, in a
scale with 7 terms, only 3 are pertinent, which is too restrictive. This fact
is corroborated in psychology
\cite{rbgh90,spwi89}, where 5, 7... values are
said to be pertinent and linguistically fit to give precise evaluations.
On practical problems, a scale of 11, 13 or
15 can then be used. Fixing the size of the scale is not enough: for
the same M, we have different operators and we have to find the most
adapted operators for our problem.
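To make this last point concrete, here is a small Python comparison (ours, for illustration) of two admissible symbolic sums on the same scale: the Lukasiewicz T-conorm used above and the maximum. They coincide when one argument is $v_1$ but diverge elsewhere, hence the need to choose the operator adapted to the problem:

```python
M = 7  # degrees of the scale are represented by their indices 1..M

def s_lukasiewicz(a, b):
    """T-conorm associated with Lukasiewicz's implication."""
    return min(M, a + b - 1)

def s_max(a, b):
    """The smallest T-conorm on the scale."""
    return max(a, b)

# Example: combining little-probable (v_3) with itself gives
# rather-probable (v_5) under s_lukasiewicz, but only v_3 under s_max.
```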
\textbf{Remark:} Let's now compare our approach to those of Aleliunas,
Darwiche, Darwiche \& Ginsberg and Spohn. Our concept of Certainty
leads to properties that are similar to those of a Coherent State of
Belief (axioms A0-A4) postulated by Darwiche \& Ginsberg
\cite{dagi92}.
Moreover, the T-conorm S that we introduced by axiom P5 stands for the
summation operator of their theorem 1. Their symbolic theory is founded
only on a Propositional Logic, whereas ours, based upon a many-valued Predicate Logic, has
a greater expressive power. In particular, we can, at the same time, represent
and manage the Imprecision and the Uncertainty. Note also that, in their theory,
no link is established between their summation operator and the implication, so
no inferential process of deductive type is proposed. On the other hand,
our system can, with this link, propagate the Uncertainty, as we will see in the next
paragraph.
A link with the structure of Aleliunas' Probability Algebra can be
made \cite{xbpo90}. The
operator I (respectively n) taking the place of * (resp. i), the axioms 1-10
proposed in \cite{xbpo90} are verified. Thus, the axiomatics of I (resp. C) leads
to a particular structure of Probability Algebra, finite and totally ordered.
\begin{tabular}{c}
\\[9.60cm]
\end{tabular}
We have added to the theory a
notion of Independence closely linked to the Conditional Certainty.
A link being
made with the implication, we add to the Symbolic extension of Bayes' Theorem
the symbolic rule of Generalised Modus Ponens as second means of propagation
for the Uncertainty.
Our axiomatics of the
Concept of Certainty does not give (in the propositional case) an
Aleliunas' Probabilistic Logic
\cite{alel88,alel90}. Indeed, it is easy to verify that
the axioms 7-9 are not satisfied by our concept.
Spohn has also proposed a theory for treating uncertain information
\cite{spoh90}. He approaches the concept of Certainty (degree of strength
of the Belief in his theory) in an ordinal way. He especially introduced the concept
of ordinal degree of conditional Certainty. Substituting the operators $+$, $\times$ and $/$ of
classical probabilities successively by his operators $\min$, $+$ and $-$, one recovers the
results of his theory, which thus appears as an ordinal, non-probabilistic model
of inductive reasoning. So, the
theory used here presents a similarity with that of Spohn. In both cases, we are
looking for a non probabilistic (in the classical sense) model of the concepts
of certainty and of conditional certainty, even if the objectives are different.
In both cases, one can propose generalisations of classical probabilistic results.
But let's point out several important differences. Spohn's approach being ordinal,
the operators used are those defined in \N.
It is clear that they
differ basically from the operators S, I and n that we have built in the scale of certainty.
The scale of Graduation in his theory is infinitely denumerable whereas ours,
by definition, is finite.
\section{CONCLUSION}% 4
In this paper, we have first studied three different examples to illustrate
the fact that, in situations where the numerical approach is ill adapted,
our model works simply and gives results that conform to the human intuition.
Then, we gave the axiomatics of the operators of a Symbolic Probability
Theory that manage the Uncertainty of the statements of the Natural Language
evaluated in a Qualitative way, exposing the different rules that may be used.
The problem of the choice of the size of the model has then been discussed,
showing that for any value of M, there is an overlapping of the solution of
size M with the solution for another M. Clearly, our theory brings new tools
to Linguistics for the explicit treatment of uncertain statements of the
Natural Language, in particular the Conditionals of Language.
%though implementations and choice problems still have to be solved.
\bibliography{ghinfo}
\bibliographystyle{unsrt}
%\begin{thebibliography}%{abcdefg}
% \bibitem{first} G.~Tesauro (1989). Neurogammon wins computer Olympiad.
%{\it Neural Computation} {\bf 1}(3):321-323.
%\end{thebibliography}
\end{document}