Encadrants :
Occurrences :
Nombre d'étudiants minimum:
Nombre d'étudiants maximum:
Nombre d'instances :
Domaines:
Context
Graph isomorphism refers to the problem of deciding whether two graphs are identical. It has many applications in various domains, ranging from biology to cybersecurity. While NP-hard, this problem can be solved in most cases by simple heuristics, like Weisfeiler-Lehman's algorithm. This algorithm turns out to be useful not only to solve the problem of graph isomorphism but also to derive a distance between two graphs or to propagate labels in the context of semi-supervised learning.
Algorithm [1]
The algorithm works as follows :
- Initialize all nodes with the same color.
- Assign a hash to each node base on his neighbourhood "3 blue neighbours and 1 red neighbour".
- Assign the same color to nodes with same hash.
- Repeat until convergence.
Application
Any working algorithm could be included in the scikit-network [4] library. This would involve testing those algorithms on real-world large-scale data.
Project
The tasks for this proposal include:
- familiarising with the scikit-network API
- implementing the Weisfeiler-Lehman algorithm [2] for graph isomorphism
- implementing the similarity measure from Shervashidze et al. [3]
- testing those algorithms on relevant data
The reference language is Python, with emphasis on the SciPy and NumPy libraries.
References
[1] https://blog.smola.org/post/33412570425/the-weisfeiler-lehman-algorithm-and-estimation-on
[2] https://arxiv.org/abs/1101.5211
[3] https://people.mpi-inf.mpg.de/~mehlhorn/ftp/genWLpaper.pdf