How to use graph-tool on a mounted drive using WSL

- by Tobias Gårdhus & Esben Lemminge

Two of the biggest pains of being a data scientist are trouble installing a specific package or tool and trouble accessing data. This post walks you through installing the Python network library graph-tool on a Windows machine using Windows Subsystem for Linux (WSL), and running a Jupyter notebook on a network drive.

The post walks you through the following steps:

  1. Install Ubuntu with WSL
  2. Mount a network drive from your local machine to Ubuntu
  3. Install and configure Miniconda as the Python environment and package manager on Ubuntu
  4. Install graph-tool (and the hSBM-TM package)

Motivation

Many a battle has been fought trying to make the graph-tool library accessible on Windows. Two things make the library attractive for data scientists working with network graphs in Python: 1) The core data structures and algorithms are implemented in C++, which boosts performance significantly on large networks compared to other libraries. 2) The library provides many advanced methods for manipulating and analysing networks, including sophisticated statistical methods such as Stochastic Block Modelling (SBM).

The library is, however, also notoriously difficult to install on a Windows machine. Having done it once with Docker, we recently faced the challenge of making it work on another Windows machine, and this time opted for the Linux installation, utilizing the option of installing Ubuntu with WSL. To top it off, we had to conduct our analysis on a network drive to ensure compliance with GDPR.

Install WSL

Install Ubuntu for WSL through the Microsoft Store: https://www.microsoft.com/en-us/p/ubuntu-2004-lts/9n6svws3rx71

After installing and launching Ubuntu, you will see a new terminal. This is how you interact with the operating system. The first time you run it, you will be prompted to set a user name and password.

Mount the network drive

By default, your system's local drive is mounted in WSL at /mnt/c. Through this you can access your local files.

To connect to a network drive that is already connected to the Windows system, we have to take two simple steps. First, we create a folder that will hold the mounted content.

In the Ubuntu terminal, create a new folder in the mnt (mount) folder.

sudo mkdir /mnt/S

Then mount the drive from your local machine to WSL. You must first have opened the S-drive in Windows File Explorer for this to work in Ubuntu.

sudo mount -t drvfs S: /mnt/S

You have to mount the S-drive every time you start a new WSL instance.
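Because the mount does not survive a restart, it can be handy to check from Python whether the drive is actually there before reading data. The following is a minimal sketch, not part of the original setup: the helper name is_mounted is our own, and it assumes the mount point /mnt/S used above.

```python
from pathlib import Path


def is_mounted(mount_point, mounts_text=None):
    """Return True if mount_point is listed as a mount target.

    mounts_text defaults to the contents of /proc/mounts; passing a string
    instead makes the function easy to try out without a real mount.
    """
    if mounts_text is None:
        mounts_text = Path("/proc/mounts").read_text()
    target = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, filesystem type, options, ...
        if len(fields) >= 2 and fields[1] == target:
            return True
    return False


# Illustrative line only; the real entry in /proc/mounts may look different.
sample = "S: /mnt/S drvfs rw,noatime 0 0\n"
print(is_mounted("/mnt/S", sample))
```

Inside WSL, calling is_mounted("/mnt/S") without the second argument reads the live /proc/mounts.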

Install graph-tool

I found the easiest way to install the graph-tool library (https://graph-tool.skewed.de/) on Linux to be through conda.

To install conda in WSL, download the Miniconda installation file with the following one-liner

curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"

Run the installer, then remove the file

# Install
bash Miniconda3.sh

# Remove
rm Miniconda3.sh

Close the Ubuntu terminal, and open cmd.exe (Windows Command Prompt)

Shut down WSL in cmd

wsl --shutdown

Open the Ubuntu terminal and update conda

conda update conda

Create a new conda environment and install graph-tool and jupyter

conda create --name gt -c conda-forge graph-tool jupyter pandas

Not all the dependencies were installed correctly, so I had to install some of them manually.

sudo apt-get update

sudo apt install libgtk-3-0

You can activate the newly created conda environment with

conda activate gt

Install hSBM-TM functions

In this section we will walk through the process of downloading the extended Topic Modelling utility for graph-tool’s hierarchical stochastic block models.

To make the hSBM-TM module available in the environment, we have to install it in the folder in which the conda environment looks for packages. The path should vary only with the environment name and Python version.

cd ~/miniconda3/envs/ENVIRONMENT/lib/pythonVERSION/site-packages

In our case, it was:

cd ~/miniconda3/envs/gt/lib/python3.10/site-packages

If you cannot find the path, try starting Python, importing a module (e.g. graph_tool), and inspecting graph_tool.__file__
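As an alternative that does not depend on any particular package being importable, the standard library can report the site-packages folder directly. This is a small sketch; it assumes you run it with the Python interpreter of the activated gt environment, since it reports the folder of whichever interpreter executes it.

```python
import sysconfig

# "purelib" is the directory where pure-Python packages are installed
# for the interpreter that runs this code (normally site-packages).
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

The printed path is the one to cd into before cloning the repository.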

When you are inside this folder, you can download the hSBM package through the following command.

git clone https://github.com/martingerlach/hSBM_Topicmodel.git

Set up jupyter notebook

We have to make a few changes to how Jupyter notebook works for it to run properly on a Windows system running WSL.

First, generate the notebook configuration file

jupyter notebook --generate-config

Then open the file

nano ~/.jupyter/jupyter_notebook_config.py

Uncomment the following line (remove the “#” at the start of the row) and set it to False

c.NotebookApp.use_redirect_file = False

For convenience, also consider uncommenting the following line and changing the default port

c.NotebookApp.port = 8889

To exit the file press Ctrl+X, press Y (yes) to save, and then Enter to exit

Then add the path to your Windows browser to the following file

nano ~/.bashrc

Scroll to the bottom of the file and add the following

# Specify the path of the windows browser
export BROWSER='/mnt/c/Program Files (x86)/Google/Chrome/Application/chrome.exe'
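The Chrome path above depends on where the browser is installed on your machine, so it is worth confirming that it points at a real file. Below is a minimal sketch of such a check; the function name browser_path_ok is hypothetical, and it only tests that the BROWSER variable is set and that the file exists, not that the browser actually launches.

```python
import os


def browser_path_ok(env=None):
    """Return True if BROWSER is set and points at an existing file."""
    env = os.environ if env is None else env
    path = env.get("BROWSER", "")
    return bool(path) and os.path.isfile(path)


if browser_path_ok():
    print("BROWSER points at an existing file")
else:
    print("BROWSER is unset or the file does not exist")
```

Run it in a new terminal (or after source ~/.bashrc) so that the exported variable is visible to Python.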

Use the package

Activate the conda environment

conda activate gt

Open a notebook. If you did not change the default port in jupyter_notebook_config.py, manually specifying a port other than the usual one may be helpful, so that you can have a notebook server running on Windows at the same time. If you did change the default, you can leave out the --port 8889 argument.

jupyter notebook --port 8889

Import the package from the folder like so:

from hSBM_Topicmodel.sbmtm import sbmtm

(Usage examples can be found at https://github.com/martingerlach/hSBM_Topicmodel/blob/master/TopSBM-tutorial.ipynb)
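If the import fails, it helps to know which piece is missing. The sketch below checks whether Python can locate the two top-level modules without importing them; check_modules is our own helper name, and the module names assume the repository was cloned into site-packages under its default folder name.

```python
from importlib.util import find_spec


def check_modules(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


status = check_modules(["graph_tool", "hSBM_Topicmodel"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

A MISSING graph_tool usually means the gt environment is not active; a missing hSBM_Topicmodel suggests the repository was cloned into a different folder.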

Deactivate a conda environment with

conda deactivate

Extra

If you want to make your life easier, you can create a bash file in your desired folder that automates activating your environment and mounting the specified network drive.

nano setup.sh

And then paste the following

#!/bin/bash

# Activate conda environment
if [ "$CONDA_DEFAULT_ENV" = "gt" ]
then
    echo "Environment is already activated"
else
    echo "Activating environment"
    conda activate gt
fi

# Mount drive
if grep -qs "/mnt/S " /proc/mounts
then
    echo "Drive is already mounted"
else
    echo "Mounting drive"
    sudo mount -t drvfs S: /mnt/S
fi

# Change to project folder

cd /mnt/S/Name_of_project_folder/

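Run the file with source (rather than bash), so that the environment activation and the directory change persist in your current shell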

source setup.sh