A Cognitive Approach for Community Detection

complex - A Cognitive approach to Community Detection

Detecting communities is a task of great importance in many disciplines, namely sociology, biology and computer science, where systems are often represented as graphs. Community detection is linked to clustering of data: many clustering methods establish links among representative points that are nearer than a given threshold, and then proceed in identifying communities on the resulting graphs.Given a graph, a community is a group of vertices “more linked” than between the group and the rest of the graph. This is clearly a lousy definition, and indeed, or a connected graph, there is not a clear distinction between a community and a rest of the graph. In general, there is a continuum of nested communities whose boundaries are somewhat arbitrary: the structure of communities can be seen as a hierarchical dendogram. A community detection algorithm should therefore return different “views”, according to the value of some control parameter. Due to the arbitrariness of the definition, such “views” are relevant if they are not crucially dependent on the precise value of the parameters, i.e., if communities appears as “plateaus” when varying the control parameter.We want to explore the behavior of exploratory methods inspired to human heuristics, in the hope of exploiting the “social knowledge” of human mind and also for developing more “natural” human-computer interfaces. Clearly, we do not pretend to simulate the real human behavior, but only to study the behavior of simplified models inspired to it. In particular, we deal with the task of identifying communities in an existing graphs, using a local algorithm and not relying on global quantities like betweenness, centrality, etc. An individual is simply modeled as a memory and a set of connections to other individuals. We explore two different approaches: in the first, information about neighboring nodes if propagated and elaborated locally, but the connections do not change. In the second approach, information is not elaborated while it is the wiring that is varied with the result of directly connecting to a “central node”. Both processes can be considered implementations of the availability heiristic, which is simply the assumption that the most vivid or easily recallable information give an accurate estimate of the frequency of the related event in the population. In both approaches, the “learning” (nonlinear) phase is modeled after competition in chemical/ecological world. This can be considered an implementation of the anchoring heuristics, in which the judgment or the action is dominated by one or very few pieces of information, the most relevant ones.

We presented here only preliminary investigations, a more detailed investigation on the statistical properties of the algorithms is underway. Primate communities are strongly social, and therefore we have developed sophisticated tools for detecting the various degree of relationship. These tools are quite slow, since in primate a large part of time is devoted to exploration and communication, and require large computational efforts and large brains.

There is not a unique definition of a community, so an exploratory algorithm, like the one that humans have presumably developed during their evolution, should present different clustering for different values of the parameters, or for different iterations. We investigated two methods, one based on pure information propagation and the other based on rewiring. The nonlinear part of these method, responsible for the actual elaboration of information, is inspired to a chemical/ecological competition model. The results that we obtained are promising.

The method under investigation is not competitive with respect to others, but it provides a natural “scanning” of the various clustering levels. Moreover, our method can be naturally applied to weighted graphs.