Regionalization of Social Interactions and Points-of-Interest Location Prediction With Geosocial Data

Traditional methods for studying the activity dynamics of people and their social interactions in cities require time-consuming and resource-intensive observations and surveys. Dynamic online trails from geosocial networks (e.g., Twitter, Instagram, and Flickr) are increasingly used as proxies for human activity, with a focus on mobility behavior, spatial interaction, and social connectivity. Social media records incorporate geo-tags, timestamps, textual components, user-profile attributes, and points-of-interest (POI) features, which respectively address the spatial, temporal, topical, demographic, and contextual dimensions of human activity. Although the information contained in social media data is complex and high-dimensional, few studies exploit the combined potential of its information layers. This article introduces a framework that considers multiple dimensions (i.e., spatial, temporal, topical, and demographic) of information from social media data and combines Geo-Self-Organizing Maps (GeoSOMs) with contiguity-constrained hierarchical clustering to identify homogeneous regions of social interaction in cities. Drawing on the discovered regions, we build a Factorization Machine-based model to estimate appropriate locations for new POIs in different urban contexts. Using geo-referenced Twitter records and Foursquare data from Amsterdam, Boston, and Jakarta, we evaluate the potential of machine learning techniques for discovering knowledge about the geography of social dynamics from unstructured and high-dimensional social web data. Moreover, we demonstrate that the discovered homogeneous regions are significant predictors of new POI locations.
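
For readers unfamiliar with Factorization Machines, the following is a minimal sketch of how a generic second-order model scores one feature vector. It is not the implementation used in this article. The function names, the parameter names (w0, w, V), and the idea of encoding region membership and POI category as input features are assumptions made only for illustration.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Score one feature vector with a generic second-order factorization machine.

    x  : (n,) feature vector (hypothetical example: one-hot region id plus
         one-hot POI category; this encoding is an assumption)
    w0 : global bias
    w  : (n,) linear weights
    V  : (n, k) latent factors, one k-dimensional row per feature
    """
    linear = w0 + w @ x
    # Pairwise term computed in O(n * k) via the standard identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i V[i, f] x_i)^2 - sum_i V[i, f]^2 x_i^2 ]
    xv = V.T @ x
    x2v2 = (V ** 2).T @ (x ** 2)
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return float(linear + pairwise)

def fm_predict_naive(x, w0, w, V):
    """Same score computed with an explicit double loop, for checking."""
    total = w0 + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return float(total)
```

The latent factors let the model estimate an interaction weight for feature pairs that rarely or never co-occur in the training data, which is one reason such models are commonly applied to sparse inputs.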