Graph similarity estimation

Encadrants : 

Occurrences : 


Nombre d'étudiants minimum: 


Nombre d'étudiants maximum: 


Nombre d'instances : 


Faisable à distance: 



Graph isomorphism refers to the problem of deciding whether two graphs are identical. It has many applications in various domains, ranging from biology to cybersecurity. While NP-hard, this problem can be solved in most cases by simple heuristics, like Weisfeiler-Lehman's algorithm. This algorithm turns out to be useful not only to solve the problem of graph isomorphism but also to derive a distance between two graphs or to propagate labels in the context of semi-supervised learning.

Algorithm [1]

The algorithm works as follows :

  • Initialize all nodes with the same color.
  • Assign a hash to each node base on his neighbourhood "3 blue neighbours and 1 red neighbour".
  • Assign the same color to nodes with same hash.
  • Repeat until convergence.


Any working algorithm could be included in the scikit-network [4] library. This would involve testing those algorithms on real-world large-scale data.


The tasks for this proposal include:

  • familiarising with the scikit-network API
  • implementing the Weisfeiler-Lehman algorithm [2] for graph isomorphism
  • implementing the similarity measure from Shervashidze et al. [3]
  • testing those algorithms on relevant data

The reference language is Python, with emphasis on the SciPy and NumPy libraries.