MEMELink
I won’t be able to assist you with coding, debugging, installation, setup or experiments. The provided files
are "as is." I've done my best to include all the info you'll need. I trust that the Thesis and this document will
guide you well for your experiments. Best of luck!
1. The LFR synthetic benchmark networks (section 3.3.3 of my PhD Thesis): you will find the generators in
the "LFR benchmark graphs" section here: https://www.santofortunato.net/resources
2. The NMI (normalised mutual information): the onmi executable file in the folder comes from here:
https://github.com/aaronmcdaid/Overlapping-NMI (usage examples are there, as well as references to the
publications about the NMI). You can download the source and compile the executable with GCC.
3. The sources for the “organic” datasets are listed below (I left some in the folder as well, but you may
want them in a different format):
a. http://www-personal.umich.edu/~mejn/netdata
b. https://deim.urv.cat/~alexandre.arenas/data/welcome.htm
1. Make sure that you have Java properly installed on your computer.
2. Extract the contents of the .zip file into a folder of your preference. It should look like this: the
MEMELINK.jar sits alongside a MEMElink folder with datasets, parameters, etc.
3. Run the .jar file with the following command.
The visualisation tool used is the GraphStream Java library; you can find it here:
https://github.com/graphstream
As you run the algorithm, the output is saved in the dataset folder. For instance, take the following
parameters, which set the algorithm to run 5 times:
file:MEMELink/datasets/karate/karate.net
runs:5
verbose:false
display:true
stats:true
popSize:100
maxGens:100
seedtype:4
There will be 10 output files in the dataset directory, two for each run: the main file has no
post-processing, and the _post file has post-processing applied.
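The parameters listed above are plain key:value lines, so they are easy to read from a script when you
prepare batches of experiments. The following is only a minimal Python sketch and is not part of MEMELink:
the function name parse_parameters is hypothetical, and it assumes every line holds exactly one key,
separated from its value by the first colon.

```python
def parse_parameters(text):
    """Parse 'key:value' lines into a dict, splitting on the first colon only."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(":")
        params[key.strip()] = value.strip()
    return params


# The example parameters from this document.
example = """\
file:MEMELink/datasets/karate/karate.net
runs:5
verbose:false
display:true
stats:true
popSize:100
maxGens:100
seedtype:4
"""

params = parse_parameters(example)
runs = int(params["runs"])
# Two output files per run: the main one and the _post one.
expected_outputs = 2 * runs
print(params["file"], runs, expected_outputs)
```

Values are kept as strings, so convert them (as done for runs) before doing arithmetic with them.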
The format of these output files is a community per line. For instance, in this output for the karate network:
2 1 3 4 5 6 7 8 9 10 11 12 13 14 17 18 20 22 28 29 31 32 33 15 16 21 23 34
26 24 25 28 32
30 24 27 33 34
33 19 34
There are four communities; each entry on a line is a node that belongs to that community. In this example
the community on the fourth line has nodes 33, 19 and 34. These are the files used to compare the results
using the NMI. Additionally, there will be a fitness.txt file with the fitness for each run.
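If you want to inspect an output file before computing the NMI, the community-per-line format can be loaded
with a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, and it
assumes node labels are integers separated by whitespace, as in the karate example above.

```python
from collections import defaultdict


def parse_communities(text):
    """Return one set of node ids per non-empty line."""
    communities = []
    for line in text.splitlines():
        nodes = line.split()
        if nodes:
            communities.append({int(n) for n in nodes})
    return communities


def overlapping_nodes(communities):
    """Map each node that appears in more than one community to the indices of those communities."""
    membership = defaultdict(list)
    for index, community in enumerate(communities):
        for node in community:
            membership[node].append(index)
    return {node: idx for node, idx in membership.items() if len(idx) > 1}


# The karate output shown in this document.
karate_output = """\
2 1 3 4 5 6 7 8 9 10 11 12 13 14 17 18 20 22 28 29 31 32 33 15 16 21 23 34
26 24 25 28 32
30 24 27 33 34
33 19 34
"""

communities = parse_communities(karate_output)
overlaps = overlapping_nodes(communities)
print([len(c) for c in communities])  # [28, 5, 5, 3]
print(sorted(overlaps))               # [24, 28, 32, 33, 34]
```

In this example nodes 24, 28, 32, 33 and 34 belong to more than one community, which is why the overlapping
NMI is used for the comparison.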
For the Xie synthetic networks, use the GT.txt (ground truth) files to compute the NMI against the generated
MML files, e.g.: ../../onmi GT.txt MML_1.txt. There is also an automate_onmi.sh script that makes running
in batches easier.
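If you prefer to drive the comparison from Python, the same command can be built for several result files
at once. This is only a sketch and does not reproduce automate_onmi.sh: the function names are hypothetical,
the MML_*.txt pattern is an assumption generalised from the single file name above, and MML_2.txt is an
illustrative name.

```python
import subprocess
from pathlib import Path


def find_results(directory, pattern="MML_*.txt"):
    """List result files in a directory (sorted lexicographically). The pattern is an assumption."""
    return sorted(Path(directory).glob(pattern))


def build_onmi_commands(onmi_path, ground_truth, result_files):
    """Build one 'onmi GT.txt result' argument list per result file."""
    return [[str(onmi_path), str(ground_truth), str(f)] for f in result_files]


def run_batch(commands):
    """Run each command and return (result_file, captured stdout) pairs."""
    results = []
    for cmd in commands:
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        results.append((cmd[-1], done.stdout))
    return results


# Only build and print the commands here; nothing is executed.
commands = build_onmi_commands("../../onmi", "GT.txt", ["MML_1.txt", "MML_2.txt"])
for cmd in commands:
    print(" ".join(cmd))
```

To run the comparisons, call run_batch(commands) from the directory that contains GT.txt and the MML files,
with the onmi executable compiled and available at the given path.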