Jean-Charles Lamirel (LORIA, CNRS/Univ. Lorraine), 19 January 2018
The Feature F-measure is a parameter-free statistical feature selection
metric that has performed well in classification, cluster labelling, and
even the measurement of clustering model quality. In this
paper, we evaluate its use in the context of real-world graphs and their
community structure, in order to benefit from its parameter-free nature
and its well-established performance. We therefore study, on realistic
synthetic graphs, the correlations between the Feature F-measure and
certain centrality measures, and especially measures designed to
characterize the community roles of nodes. We show that this measure is
linked to the centrality of the network's nodes, and that it is particularly
well suited to measuring their connectivity with respect to the community
structure. We also observe that the usual measures for the
detection of community roles depend strongly on the size of the
communities, whereas the ones we propose are by definition linked to the
density of the community, which makes their results comparable from one
network to another. This makes it possible to revisit the results
obtained with classical measures regarding leadership in scientific
communities, removing the attraction bias that may result simply from
being embedded in large communities. It also opens the way to
applications such as the temporal monitoring of community structure.
Finally, the selection process applied to nodes yields a universal
scheme, in contrast to the thresholds previously set empirically to
assign community roles.