Chapter 1 Introduction
The development of this work has been full of surprises.
At first the task seemed simple: to simplify the process of specifying
what to display based on incoming alarms. Since the problem of alarm correlation
must be encountered all over the world, I expected that many
approaches would already be documented. Unfortunately the commercial nature
of such systems made published descriptions difficult to find.
The problem with the system in use was that it was
- difficult to specify rules,
- difficult to get new rules to work.
Of course this is a difficult problem to solve for the uninitiated.
Reverse engineering a package in order to get it to work better is daunting, to
say the least. Working out how to put it to best use seemed the better way.
The first way to tackle this problem seemed to be to analyse what
the operators of such systems wanted. What sort of things did they
specify when they asked for a correlator? Did they just give a general
command, to make fewer alarms appear on their screens, or did they give
ID numbers for the alarms that were to be suppressed?
Efforts in this direction were found to be fruitless.
1.1 Artificial Intelligence
Fortunately the area of Artificial Intelligence has already come across
such problems. One of the first tasks computers were aimed at was to be
oracles of wisdom for areas that required lots of specialised knowledge.
The problems encountered in building such systems were:
- Getting the specialised knowledge into a formal structure
- Representing the knowledge
- Fixing inaccuracies in the coded knowledge
- Adding new knowledge
- Updating old knowledge
The term Knowledge Acquisition Bottleneck became a common phrase.
To some it was the only thing holding back Knowledge Based Systems
from blossoming into efficient and profitable success stories.
Admittedly, there were some remarkable success stories in the area of expert
systems. But once they were built, they were found to be expensive to maintain.
Thus the initial excitement with expert systems has faded, and there remains
much cynicism as to whether a rules based system can be made truly maintainable
and accurate.
1.2 Production Rules
It turned out that the existing alarm correlator was built using a production rules
system, an approach that had been trying to deal with this problem for years. The underlying
concept seemed troublesome enough that extra software was needed for Truth Maintenance,
the task of checking a knowledge base for logical consistency. Different
scheduling algorithms were used to maximise efficiency. Debugging tools
were available, as in programming shells. Unfortunately the whole thing was so complex
that a software engineer was needed to do any work on the knowledge base.
The old problem of passing knowledge from the domain expert to the computer
engineer was seemingly insurmountable.
1.3 Ripple-Down Rules
The next revelation in my research was the new (albeit 10-year-old)
research happening in the area
of Ripple-Down Rules (RDR). Here was a rules based system that claimed to be easy to
maintain. Not only that, but there was a working application of 2000 rules in use
in St. Vincent's Hospital that had been built and maintained, not by a team of dedicated
software engineers, not even by a single software expert, but directly from the
pathology lab. The addition of rules was not a big project, but a routine 15-minute-a-day
exercise. The system's accuracy was at 99% and getting better every day!
Of course there were bound to be differences involved in taking a
system used for medical diagnosis and applying it to filtering
telecommunication alarms. The first seemed to be that
medical diagnosis takes one group of attributes, the analysis of
one human being, whereas alarm correlation takes in a variable number
of groups of attributes. It seemed like extending the medical
diagnosis to include the entire city of Sydney to check for
potential epidemics.
Another problem was the fact that a telecommunication system is changing all the time.
New components are added, new networks extended, old components removed.
The human body is fairly static in comparison. People don't just start growing
extra limbs. The central nervous system doesn't decide to extend outside the usual
limits of the body.
The problem of diagnosing thyroid conditions is the same over time,
although of course there may be differences in expert opinions.
The interesting thing is that a change in the opinions of the lab expert is reflected
in the knowledge base over time. The ease with which rules can be refined
means that changing the knowledge base is less of a problem
than might have been expected.
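The kind of refinement described above can be sketched in a few lines. This is a minimal illustration of the usual single-classification RDR structure (each rule has an exception branch followed when it fires and an alternative branch followed when it does not), not the code of the hospital system. The names `Rule`, `classify` and `refine`, the attributes `tsh` and `on_treatment`, and the threshold are all invented for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Case = Dict[str, float]


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    if_true: Optional["Rule"] = None   # exception branch, tried when this rule fires
    if_false: Optional["Rule"] = None  # alternative branch, tried when it does not


def classify(root: Rule, case: Case) -> str:
    """Return the conclusion of the last rule that fired on the path through the tree."""
    conclusion = "no conclusion"
    node: Optional[Rule] = root
    while node is not None:
        if node.condition(case):
            conclusion = node.conclusion
            node = node.if_true
        else:
            node = node.if_false
    return conclusion


def refine(root: Rule, case: Case, condition: Callable[[Case], bool], conclusion: str) -> None:
    """Attach a new rule at the point where evaluation of `case` ended.

    Existing rules are never edited, so a correction only affects cases
    that reach the same point in the tree.
    """
    node = root
    while True:
        branch = "if_true" if node.condition(case) else "if_false"
        child = getattr(node, branch)
        if child is None:
            setattr(node, branch, Rule(condition, conclusion))
            return
        node = child


# Start from a default rule that always fires (hypothetical knowledge base).
root = Rule(lambda c: True, "normal")

# The expert disagrees with "normal" for this case and adds an exception.
refine(root, {"tsh": 9.0}, lambda c: c["tsh"] > 5.0, "raised TSH")

# Later the expert refines that exception for a more specific case.
refine(root, {"tsh": 9.0, "on_treatment": 1.0},
       lambda c: c.get("on_treatment", 0.0) == 1.0, "raised TSH, treated")

print(classify(root, {"tsh": 2.0}))
print(classify(root, {"tsh": 9.0}))
print(classify(root, {"tsh": 9.0, "on_treatment": 1.0}))
```

Because each correction is attached only where the misclassified case ended up, the expert does not need to reason about the rest of the knowledge base when adding a rule.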
The first step seems to be to get a knowledge base started. After this the expert
can refine the rules, until the system reflects what you want.
This first step is tougher than you think. The Garvan RDR system was put into the
lab with 200 rules already in place. Its developers had the benefit of experience with a
previous rules based expert system (Garvan-ES1) to provide the initial rules.
With a telecommunication system, which is renowned for its irregularity and size,
how do you start off a knowledge base?
1.4 Qualitative Modelling
A similar task was faced by a team in Ljubljana, Yugoslavia (now Slovenia). They had the
task of modelling the human heart to identify arrhythmias in electrocardiograms (ECGs).
There was no classical model of the heart to start from. No one knew for sure how
to quantify the effects that one part had upon another. There was a plethora of medical
literature to dig through, and experts took years to learn these things.
They turned to deep qualitative models to describe how different problems
produced the ECG features. The interpretation of an ECG is a complicated task, since
one feature of an ECG can point to a variety of explanations. A qualitative model
is an abstract description of a system. Instead of numerical equations it employs simple
relationships like `when this thing increases, that thing decreases'.
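To make the idea of such relationships concrete, here is a minimal sketch of qualitative propagation. It is not the Ljubljana model: the variable names, the `+`/`-` influence signs and the `propagate` function are assumptions chosen for illustration, and the sketch ignores the case where two influences on the same variable disagree.

```python
from typing import Dict, List, Tuple

# Qualitative directions of change instead of numbers.
INC, STD, DEC = "inc", "std", "dec"
_OPPOSITE = {INC: DEC, DEC: INC, STD: STD}

# (cause, effect, sign): "+" means the effect moves the same way as the
# cause, "-" means it moves the opposite way.
Influence = Tuple[str, str, str]


def propagate(influences: List[Influence], changes: Dict[str, str]) -> Dict[str, str]:
    """Push known directions of change through the influence list.

    A variable keeps the first direction derived for it; conflicting
    influences are not resolved in this sketch.
    """
    result = dict(changes)
    changed = True
    while changed:
        changed = False
        for cause, effect, sign in influences:
            if cause in result and effect not in result:
                direction = result[cause]
                result[effect] = direction if sign == "+" else _OPPOSITE[direction]
                changed = True
    return result


# Hypothetical three-variable system: when A increases, B increases;
# when B increases, C decreases.
influences = [("A", "B", "+"), ("B", "C", "-")]
print(propagate(influences, {"A": INC}))
```

Nothing in the model says by how much B or C changes, only in which direction, which is what makes it usable when the numerical relationships are unknown.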
Of course there are different levels of abstraction. Ideally the higher, more general levels
should be able to make use of the more detailed levels to check themselves for consistency.
Once the model is made, it can be used to generate cases. Once you have a set of cases
you can apply Machine Learning to give you a set of rules. Once you have a set of rules
you can start putting them into use and getting a domain expert to refine them.
Before you know it you have a fully fledged expert system!
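The model-to-cases-to-rules pipeline can be sketched as follows. Everything here is hypothetical: `MODEL` is a toy stand-in for a qualitative model, mapping each fault to the features it produces, and `induce_rules` is a trivial table-lookup grouping that stands in for a real Machine Learning algorithm.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

# Toy model (invented): each fault produces a set of observable features.
MODEL: Dict[str, Set[str]] = {
    "fault_a": {"f1", "f2"},
    "fault_b": {"f2"},
    "fault_c": {"f3"},
}

Case = Tuple[FrozenSet[str], Tuple[str, ...]]


def generate_cases(model: Dict[str, Set[str]], max_faults: int = 2) -> List[Case]:
    """Enumerate fault combinations and derive the features each one produces."""
    cases: List[Case] = []
    for k in range(1, max_faults + 1):
        for faults in combinations(sorted(model), k):
            features = frozenset().union(*(model[f] for f in faults))
            cases.append((features, faults))
    return cases


def induce_rules(cases: List[Case]) -> Dict[FrozenSet[str], List[Tuple[str, ...]]]:
    """Group cases by observed features: 'if these features, then these explanations'."""
    rules: Dict[FrozenSet[str], List[Tuple[str, ...]]] = defaultdict(list)
    for features, faults in cases:
        rules[features].append(faults)
    return dict(rules)


cases = generate_cases(MODEL)
rules = induce_rules(cases)
for features, explanations in rules.items():
    print(sorted(features), "->", explanations)
```

The resulting rules are only a starting point: they give the domain expert an initial knowledge base to correct, rather than a finished one.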