The problem of designing a computer
program that can recognize and respond to human emotions is difficult
for several reasons. Emotional information comes in many
different forms, and computers, unlike humans, are not natively
equipped to process and understand this information. This means
that either
humans must explicitly
provide information about their emotional state, or they must somehow
enhance their innate emotional signals so that computers can
successfully receive them. Both of these techniques have
drawbacks. In the first case, the human must expend time and
energy to tell the computer something that, to the human, is
obvious. In the second case, the human must generally wear a
piece of gear or must have access to a specific piece of hardware in
order to provide the emotional signals.
However, in order to communicate with others, the human must already
provide many sources of information that the computer might
'eavesdrop' on. These include face-to-face, telephone, and,
increasingly, email communication. In these
cases, the computer should be able to unobtrusively pick up cues about
the emotional state of a user, and build a model of him or her, as well
as of those with whom he or she regularly interacts.
For this project, I wanted to show that it was possible to extract
useful information about the user's current emotional state:
- starting from a small amount of built-in knowledge
- via tasks that they already perform
- without forcing the user to do extra work to provide the
information
- and without a lot of heavyweight processing machinery or
non-standard hardware
I am not claiming that my approach can extract as much information, or
information as useful, as efforts without these constraints. I am simply
attempting to see what can be done under these relatively strict
conditions. The easiest way to achieve the above seemed to be in
the domain of email processing.
Empaceptor performs a series of
operations on the text it is given to build up a set of
associations. Empaceptor makes use of two main packages to do its
text processing: the General Architecture for Text Engineering (GATE)
package, and WordNet. GATE
is a very powerful and full-featured set of tools for performing
various kinds of information extraction, among other things. It
provides word tokenizing, sentence splitting, part-of-speech tagging,
and entity extraction out of the box, and it can be easily extended to
extract other kinds of information. The capabilities of WordNet
have been described in depth in many other sources.
The main premise of the Empaceptor text processing is simple: words
that occur in the same sentence as the base set of emotional keywords
should take on a partial meaning of the nearby keyword.
The more often a word occurs in one emotional context, the stronger the
association should be, and a single word may take on many different
associations.
Before text can be processed, some setup is required. An editable
list of emotional keywords, the base set of emotions, is loaded.
Each keyword is looked up
in the WordNet dictionary to retrieve the associated synsets.
This allows for much broader coverage than simply using the
original keywords to search. The initial set of 68 expands
to nearly 200 synsets. Once these are loaded, text can be
processed as either a document to be learned from, or as a document to
be scored for emotional content. Here is how text is processed in
learning mode:
- The document, as a Java String object, is handed to the
GATE text processing system.
- GATE returns a Document object which contains a set of
Annotations of different types such as Token, Whitespace, Sentence,
Person, etc.
- The Empaceptor code walks this Document, finding all of the
Tokens that represent words (as opposed to punctuation or numbers).
- Each word Token is passed to WordNet to retrieve its synsets
(adjectival only).
- The synsets of the Token are compared with the pre-loaded
emotional synsets
to see if there is overlap. If so, every word in the Sentence
(whose boundaries are determined by the GATE annotations) is marked as
having that emotional content. The occurrence count is also
incremented for each of the Tokens. If more than one emotional
word occurs in the sentence, all of the words are counted for each
emotional occurrence.
- If none of the Tokens in a Sentence belong to the emotional
synsets, then the words are simply marked as having occurred.
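The learning steps above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not the Empaceptor code: sentences are split on terminal punctuation instead of GATE Sentence annotations, a word counts as emotional when it appears in a plain keyword set instead of sharing a WordNet synset, and each distinct emotion is counted once per sentence, which is one reading of the description. The class and field names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the learning pass; GATE and WordNet are not used. */
class AssociationLearner {
    private final Set<String> emotionKeywords;
    /** word -> total number of occurrences seen so far */
    final Map<String, Integer> totalCounts = new HashMap<>();
    /** emotion -> (word -> occurrences in sentences containing that emotion) */
    final Map<String, Map<String, Integer>> emotionalCounts = new HashMap<>();

    AssociationLearner(Set<String> emotionKeywords) {
        this.emotionKeywords = emotionKeywords;
    }

    void learn(String document) {
        // Assumption: naive sentence splitting on '.', '!' and '?'.
        for (String sentence : document.split("[.!?]+")) {
            String[] words = sentence.toLowerCase(Locale.ROOT).split("[^a-z']+");

            // Find the distinct emotional keywords present in this sentence.
            Set<String> found = new HashSet<>();
            for (String word : words) {
                if (emotionKeywords.contains(word)) {
                    found.add(word);
                }
            }

            for (String word : words) {
                if (word.isEmpty()) {
                    continue; // split() can yield a leading empty string
                }
                // Every word is marked as having occurred.
                totalCounts.merge(word, 1, Integer::sum);
                // Every word also takes on each emotion found in its sentence.
                for (String emotion : found) {
                    emotionalCounts
                        .computeIfAbsent(emotion, k -> new HashMap<>())
                        .merge(word, 1, Integer::sum);
                }
            }
        }
    }
}
```

Keeping the total count separate from the per-emotion counts means a word's association with any emotion can later be read off as a simple ratio of the two.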
The learned associations are stored in a simple
XML format so that the server can be started and stopped without losing
previous associations. It can be reset by simply discarding the
XML file.
Later in the project, this was slightly modified to note whether a given
Token was also marked with a Person annotation. The
assumption was that if the name of a person occurs in the same sentence
as an emotion, it is more likely that the emotion is related to that
Person. Therefore, when a Person token occurs with an emotion, it is
counted as twice as likely to be emotional as a regular token.
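One way to picture the Person weighting is as a larger increment on the emotional count. The sketch below assumes exactly that: a Person token adds 2 to its emotional count where a regular token adds 1. How Empaceptor actually applies the doubling is not specified here, so the constants and the method name are hypothetical.

```java
import java.util.Map;

/** Hypothetical illustration of doubling the emotional weight of Person tokens. */
class PersonWeighting {
    static final int REGULAR_INCREMENT = 1;
    static final int PERSON_INCREMENT = 2; // assumption: "twice as likely" means a double increment

    /** Records one emotional occurrence of a token in the given count table. */
    static void recordEmotionalOccurrence(
            Map<String, Integer> emotionalCounts, String token, boolean isPerson) {
        int increment = isPerson ? PERSON_INCREMENT : REGULAR_INCREMENT;
        emotionalCounts.merge(token, increment, Integer::sum);
    }
}
```

Under this reading, a name that appears once alongside an emotion ends up with the same emotional count as a regular word that appeared alongside it twice.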
With the trained system, it is possible to determine the emotional
valence of any word:
- If a word did not occur in the training corpus, it is given a
score of 0.0 for every emotional category.
- For any other word, the score is the number of emotional
occurrences divided by the total number of occurrences.
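The two scoring rules reduce to a single ratio with a guard for unseen words. A minimal sketch, with a hypothetical class and method name:

```java
/** Hypothetical helper: the per-word score for one emotional category. */
class WordScore {
    /**
     * Returns emotional occurrences divided by total occurrences,
     * or 0.0 when the word never occurred in the training corpus.
     */
    static double score(int emotionalOccurrences, int totalOccurrences) {
        if (totalOccurrences == 0) {
            return 0.0;
        }
        return (double) emotionalOccurrences / totalOccurrences;
    }
}
```

For instance, a word seen twice overall and once in a 'happy' sentence gets `WordScore.score(1, 2)`, which is 0.5.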
Here is an example, assuming that the emotional list contains only
'happy'.
I am happy about my new car. I put it in my garage.
Here is the resulting table:
Word  | Happy Weight | Word   | Happy Weight
I     | 1/2 (0.5)    | car    | 1/1 (1.0)
am    | 1/1 (1.0)    | put    | 0/1 (0.0)
happy | 1/1 (1.0)    | it     | 0/1 (0.0)
about | 1/1 (1.0)    | in     | 0/1 (0.0)
my    | 1/2 (0.5)    | garage | 0/1 (0.0)
new   | 1/1 (1.0)    |        |
One feature of this system, which is beginning to be evident in
this table, is that common words such as 'I' are used so frequently
that they lose any emotional association.
To score an entire document:
- The document is chopped up in a quick and dirty fashion (GATE is
not used) into a set of tokens.
- Each word token is passed to the scoring algorithm above, once
for each of the available emotional synsets.
- The score for each token for each emotion is aggregated, and then
divided by the number of tokens, in order to normalize for the length
of the document. This gives a final rating for the document for
each of the available emotions.
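The document-level score for one emotion can be sketched as an average of per-word scores. The tokenizer below (lowercase, split on anything that is not a letter or apostrophe) is an assumed stand-in for the "quick and dirty" chopping, and the map of word scores is taken as already computed; both the class name and the map layout are hypothetical.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of scoring a whole document for a single emotion. */
class DocumentScorer {
    /**
     * @param document   raw text to score
     * @param wordScores word -> learned score for the emotion of interest
     * @return the sum of word scores divided by the number of tokens
     */
    static double score(String document, Map<String, Double> wordScores) {
        // Assumption: a crude tokenizer in place of GATE.
        String[] tokens = document.toLowerCase(Locale.ROOT).split("[^a-z']+");
        double sum = 0.0;
        int count = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue;
            }
            count++;
            // Words never seen in training contribute 0.0.
            sum += wordScores.getOrDefault(token, 0.0);
        }
        // Normalize for the length of the document.
        return count == 0 ? 0.0 : sum / count;
    }
}
```

Running the same method once per emotion, each time with that emotion's word scores, gives the full set of ratings for the document.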
Interacting with Empaceptor - mbox import
Importing of mbox files is relatively straightforward. A
reference to the file itself is passed to Empaceptor. Empaceptor
uses a library from the GNU project that implements one of the
JavaMail API service provider interfaces. The library parses the mbox
file and returns an array of Message objects. These Message objects must
be further parsed to throw away most of the header information,
basically everything except the Subject: line. The rest of the
data is formatted as a String and handed to the learning algorithm
above. This service can be invoked from the EmpaceptorGUI
described below.
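The header-stripping step can be illustrated on raw message text. Empaceptor works on JavaMail Message objects; the sketch below instead assumes a plain RFC 822 style string, with headers separated from the body by a blank line, and ignores folded (multi-line) headers. The class and method names are hypothetical.

```java
/** Hypothetical sketch: keep only the Subject line and the body of a raw message. */
class MessageText {
    static String subjectAndBody(String rawMessage) {
        String normalized = rawMessage.replace("\r\n", "\n");

        // Headers end at the first blank line.
        int split = normalized.indexOf("\n\n");
        String headers = split < 0 ? normalized : normalized.substring(0, split);
        String body = split < 0 ? "" : normalized.substring(split + 2);

        // Keep the Subject header and discard every other header.
        String subject = "";
        for (String line : headers.split("\n")) {
            if (line.regionMatches(true, 0, "Subject:", 0, 8)) {
                subject = line.substring(8).trim();
                break;
            }
        }
        return subject.isEmpty() ? body : subject + "\n" + body;
    }
}
```

The returned String is the kind of input that would then be handed to the learning pass.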
Interacting with Empaceptor - email service
In order to invoke the email processing service, Empaceptor must be
used with an email program that supports the execution of arbitrary
shell commands as filter steps. The Empaceptor Java files must be
placed in a standard location that can be referenced from the email
client. When an incoming message is received, the email client starts
the email service client class and pipes the message to its standard
input, passing the desired emotion for scoring as an argument. The
client class reads the desired emotion argument and the email from the
incoming pipe, and then opens a standard TCP/IP socket
to Empaceptor. It sends the information and
waits for a response of either "yes" or "no" from Empaceptor.
Empaceptor
uses the scoring algorithm described above to determine the strength of
each emotion in the text. It then looks up the value for the
desired emotion and compares it with a configurable threshold. It
returns "yes" if the value is above the threshold. Empaceptor
does not attempt to learn from messages coming through the email
service.
If the client gets back a "yes", it exits with code '1',
otherwise it exits with code '0'. If something goes wrong, it
exits with code '15' to avoid confusion. The email program can
then use the result to take an action, such as tagging the message with
a color. The server that accepts incoming requests from the email
service clients can be started and stopped from the EmpaceptorGUI
described below.
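The decision logic on both ends of the socket is small enough to sketch. The socket handling and the wire format are omitted because they are not specified here. Treating an unexpected or missing reply as the error case is an assumption, and the class, method, and constant names are hypothetical.

```java
/** Hypothetical sketch of the yes/no decision and the client's exit codes. */
class EmailServiceProtocol {
    static final int EXIT_NO_MATCH = 0;
    static final int EXIT_MATCH = 1;
    static final int EXIT_ERROR = 15;

    /** Server side: "yes" only when the score is above the threshold. */
    static String reply(double score, double threshold) {
        return score > threshold ? "yes" : "no";
    }

    /** Client side: map the server's reply to a process exit code. */
    static int exitCode(String reply) {
        if (reply == null) {
            return EXIT_ERROR; // assumption: no reply counts as a failure
        }
        switch (reply.trim()) {
            case "yes":
                return EXIT_MATCH;
            case "no":
                return EXIT_NO_MATCH;
            default:
                return EXIT_ERROR; // assumption: anything else is a failure
        }
    }
}
```

A client built this way would end with something like `System.exit(EmailServiceProtocol.exitCode(reply))`, leaving the mail program to act on the exit status.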
Interacting with Empaceptor - SMTP server
Empaceptor will open a port to allow incoming SMTP connections.
An email program simply needs to be told to send its outgoing mail to
that port. The SMTP server is a slightly modified version of the
jes code developed by Eric Daugherty. It is modified only to send any
incoming message to Empaceptor before sending it out to the world.
It sends the content of the message to both the learning algorithm and
the scoring algorithm in order to gather historical data about trends
(the trend analysis is currently unimplemented). The SMTP server can be
started and stopped from the EmpaceptorGUI described below.
The EmpaceptorGUI
The EmpaceptorGUI is the user interface to the Empaceptor system.
When launched it opens a small window on the desktop with a menu to
access a set of tasks. It can do the following:
- Open an mbox file to process
- Start & stop the server to handle incoming email client
requests.
- Set the threshold for what qualifies as having emotional content
(default is 0.1)
- Start & stop the SMTP server
- Configure the list of available emotions (although old messages
will not be reprocessed to check for newly added emotions)
- Open a test box for sending messages to Empaceptor without
needing to use email. Text entered will be scored, and the
results shown in a table.
Usage Story & Shortcomings
I attempted to use Empaceptor to see how it would perform on my daily
volume of email. I began by parsing my Sent mail file,
approximately 440 messages, as the initial training data. I
ran the Empaceptor email service and SMTP server, and configured my
email client, Evolution, to invoke the service looking for "happy"
messages, with the success threshold set at 0.1. I set the filter rule
to set the message color to
purple if the Empaceptor email service returned a positive result (a
return value of 1) so that the results would be easy to discern at a
glance.
The system was too slow at first, so I reconfigured the message
processing to make scoring very fast and to have learning happen
asynchronously in a separate thread, so that the email process would not
have to wait. This made the system usable day to day.
I ran the system for approximately 5 days to see what it
thought were happy messages. Some interesting notes:
- It did a decent job. I couldn't come up with a metric for
determining 'accuracy', but it erred too much toward false
positives, especially on short messages.
- 440 messages was not really enough training data. It was
too biased toward certain words such as my name.
- It was easily fooled by idiomatic usage of emotional terms.
I had an email conversation about putting together a group to go to
Happy Hour, and it marked all of those messages as 'Happy'.
The same was true of one-liners such as "I'd be happy to take care of
that".
Future Work
As with any software project, there are
many features that did not make it into this prototype. They
include:
- The ability to look at the last 10, 100, or 1000 messages to see
trends. What emotions are the strongest? What is gaining
strength? What is losing strength?
- More sophisticated text processing. This might include a
deeper parse of the individual sentences, or making more use of the
GATE Annotations to extract more of the structure, giving fewer false
positives.
- Take advantage of recency. Emotional associations should
weaken over time, so that the most recent usages are more indicative of
the user's current state of mind.
- Get it out into the world! I am going to make it available for
download under the GPL (since it relies on GPL software), in case others
want to try to interface with it. I need to figure out all of the
licensing issues before I make a full release.
Related Work
Liu et al. (2003) have done quite a bit of inspiring work on
affective text processing. My work differs from their email
processing work in that I am using a low-knowledge approach, as opposed
to their use of common sense reasoning, and I am attempting to
interface with existing end-user tools, while they produced a new email
client in which the analysis takes place.
The Eudora mail client uses 'Chili Peppers' in its Mood Watch
feature to rate the level of vitriol in an incoming
message. It uses simple keyword spotting to find vulgarity,
racial epithets, and other strongly offensive language. While I
share in the spirit of making affective computing tools available with
little support from the user, my approach is aimed less at
scoring emails (although that is a feature) than at learning
the user's associations between words and evoked emotions, albeit in a
very simple way. I am attempting a knowledge-based approach that
is somewhere in between the common sense reasoning of Liu and the
keyword spotting of Mood Watch.
Thanks
I would like to give particular thanks (as I often do in these rapid
development projects) to the open source community for the wealth of
tools that are available, and especially to the developers of GATE,
jes, and the GNU classpath, classpathx, and inetlib libraries.
I would also like to thank Prof. Picard for her feedback and support.