Predicting Arousal

As an exploratory project, further work was undertaken with the initially collected data set, focusing on a potential approach to predicting arousal levels at a specified geographical location. A database of the users' movements and their EDA arousal levels had already been created; this data was analyzed and the gradient of each peak and trough in the arousal signal was computed. With this metric available for each recorded path, it was then possible to identify the geographic location where the highest gradient was observed.
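The gradient step can be sketched as follows. This is a minimal illustration rather than the project's actual implementation: the `Sample` record, its field names, and the choice to measure the gradient between consecutive turning points (local peaks and troughs) are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    """One hypothetical reading along a recorded path."""
    t: float    # seconds since the start of the path
    eda: float  # EDA reading (units assumed, e.g. microsiemens)
    lat: float
    lon: float


def turning_points(samples: List[Sample]) -> List[int]:
    """Indices of local peaks and troughs, plus both endpoints."""
    idx = [0]
    for i in range(1, len(samples) - 1):
        before = samples[i].eda - samples[i - 1].eda
        after = samples[i + 1].eda - samples[i].eda
        if before * after < 0:  # a sign change marks a peak or a trough
            idx.append(i)
    idx.append(len(samples) - 1)
    return idx


def steepest_segment(samples: List[Sample]) -> Optional[Tuple[float, Sample]]:
    """Largest absolute EDA gradient between consecutive turning points.

    Returns (gradient, sample at the end of that segment), or None when
    the path is too short to contain a segment.
    """
    if len(samples) < 2:
        return None
    pts = turning_points(samples)
    best: Optional[Tuple[float, Sample]] = None
    for a, b in zip(pts, pts[1:]):
        dt = samples[b].t - samples[a].t
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        gradient = (samples[b].eda - samples[a].eda) / dt
        if best is None or abs(gradient) > abs(best[0]):
            best = (gradient, samples[b])
    return best
```

The latitude and longitude of the returned sample give the location to carry forward. Flat plateaus in the signal are ignored here, and a real signal would probably need smoothing first.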

Taking the longitude and latitude of these EDA events, it was then possible to geo-enrich the original data by gathering external sources of information related to each location. In this basic example, the business nearest to each EDA event's location was identified using the Yelp business directory. It is naively assumed that this nearest business may have some effect on the overall environment and thus on the user's EDA response. Under this assumption, user reviews for the business can be extracted and their text used as a further metric for understanding what within the environment may be causing the user's EDA response.
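The nearest-business lookup can be illustrated with a great-circle distance calculation. The sketch below assumes the candidate businesses have already been retrieved as a local list; the `Business` record and the function names are hypothetical, and no call to the Yelp directory itself is shown.

```python
import math
from typing import Iterable, NamedTuple, Optional


class Business(NamedTuple):
    """Hypothetical record for a business already fetched from a directory."""
    name: str
    lat: float
    lon: float


EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def nearest_business(lat: float, lon: float,
                     businesses: Iterable[Business]) -> Optional[Business]:
    """Return the business closest to the EDA event, or None if there are none."""
    return min(businesses,
               key=lambda b: haversine_m(lat, lon, b.lat, b.lon),
               default=None)
```

In practice a distance cut-off would probably be worth adding, so that an event far from any business is not attributed to one.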

Once this review data has been collected, basic data mining techniques such as frequency analysis can be applied to understand the connections between the user reviews, and thus what could potentially be part of the stimuli that led to the EDA response.
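A simple form of frequency analysis over review text might look like the sketch below. The tokenizer, the short stop-word list, and the minimum word length are illustrative assumptions, not the filtering actually used in the project.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# A deliberately small, illustrative stop-word list (an assumption).
STOP_WORDS = frozenset({
    "the", "and", "was", "were", "with", "for", "that", "this",
    "but", "are", "had", "have", "not", "you", "they", "from",
})


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(reviews: Iterable[str],
                     top_n: int = 10) -> List[Tuple[str, int]]:
    """Count the words across all reviews, skipping stop words and short tokens."""
    counts: Counter = Counter()
    for review in reviews:
        counts.update(
            word for word in tokenize(review)
            if word not in STOP_WORDS and len(word) > 2
        )
    return counts.most_common(top_n)
```

The resulting (word, count) pairs are the kind of input a tag cloud is typically drawn from, with the count controlling the size of each word.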

Through this approach, the highest gradient change for 8 of the paths was collected and the associated geographical location extracted. These locations were then queried against the Yelp business directory to gather the user reviews associated with them, as illustrated below. This produced a text corpus of over 10 thousand words, which was then pre-processed to remove common irrelevant words before being used as input for frequency analysis.

Text Mining

Through this process the following tag clouds were produced, which illustrate words that may be associated with the arousal event.

Tag Cloud - Highest Value

Tag Cloud - Highest Gradient

With this text-based source, it is then potentially possible to use techniques such as the naive Bayes classifier to apply it to other locations and the reviews associated with them. This would allow for the creation of adaptive maps which could highlight locations where higher arousal levels are expected.
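To make the classifier idea concrete, the following is a minimal multinomial naive Bayes sketch with add-one (Laplace) smoothing. It was not part of the work described here: the class, the "high" and "low" labels, and the idea of training on labeled review text are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocab: set = set()

    def fit(self, examples: Iterable[Tuple[str, str]]) -> None:
        """Train on (review text, label) pairs."""
        for text, label in examples:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def log_scores(self, text: str) -> Dict[str, float]:
        """Unnormalised log-probability of each label given the text."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores: Dict[str, float] = {}
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text: str) -> str:
        """Return the most probable label for the text."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

A model of this kind would be trained on reviews from locations with known arousal readings, then applied to the reviews of unvisited locations to decide how each should be shaded on a map.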

This predictive approach to arousal is one area which could be further explored, focusing on more effective methods for both the identification of arousal events and the filtering of associated geo-enriched data. The current approach demonstrates the possibility of this technique but, due to the naive assumptions that it makes, cannot be used to conclude any definitive connections.