WSU Pique Graph Isomorphism Application
This tool analyzes and compares source code structures using graph isomorphism.
It is useful for:
- Code clone detection
- Vulnerability detection
- Software analysis
You can run the application on native Linux or inside a Docker container.
Table of Contents
- Setup (Native Linux)
- Setup (Docker)
- Using the Application
- Command Reference
- Developer Tools
- Troubleshooting
- License
- Contact
- Contributors
Setup (Native Linux)
1. Set Up a Linux Environment
First, choose a Linux distribution. We recommend Ubuntu, since that's what this application is tested on, but any Linux distribution should work with appropriate modifications. Once you have chosen a distro, set up the environment in one of the following ways:
- The most direct way is to install Linux directly on your hard drive and configure your computer to boot into it.
- An alternative to a bare-metal installation is virtualization, using either a virtual machine or WSL (Windows Subsystem for Linux) running a Linux distribution.
To install WSL on Windows, run the following in a Windows terminal:
wsl --install
2. Clone the Repository
Open the Linux terminal and run:
sudo apt install git && git clone https://github.com/MSUSEL/wsu-pique-graph-isomorphism.git
This ensures that git is installed and then clones the repository.
3. Set Up Processing Environment
Open the Linux terminal, navigate to the cloned repository directory, and then run the script to automatically set up the rest of the environment:
./setup_environment.sh
4. Run the Application
In the terminal, run the following command to start the application:
make app
Note: You will likely need to run some of these commands with sudo privileges, in which case you can prepend sudo to the command.
Setup (Docker)
1. Download Docker Desktop
Go to https://www.docker.com/products/docker-desktop and download the installer for your operating system.
2. Install Docker Desktop
Run the installer and follow the on-screen instructions to install Docker Desktop.
3. Clone the Repository
Open the terminal and run:
git clone https://github.com/MSUSEL/wsu-pique-graph-isomorphism.git
Alternatively, download the repository directly from GitHub and unzip it.
4. Build and Deploy the Docker Image
Navigate to the directory where the repository was cloned or unzipped, and run the following command to build and deploy the Docker image:
docker compose up
Using the Application
Once you've completed either version of the setup, you can open a web browser and enter the following URL to access the application: http://localhost:8501
In the web browser, you should see the application interface. From there, you can upload source code files, analyze them, and visualize the results.
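If the page does not load, a quick check from Python can tell you whether anything is answering at that address. This is a minimal sketch using only the standard library; the function name app_is_reachable is hypothetical and not part of the application.

```python
import urllib.error
import urllib.request


def app_is_reachable(url="http://localhost:8501", timeout=3.0):
    """Return True if an HTTP server answers at `url` with a success status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # A server answered, but with an error status (4xx or 5xx).
        return False
    except (urllib.error.URLError, OSError):
        # Nothing is listening, the host is unknown, or the request timed out.
        return False


if __name__ == "__main__":
    print("reachable" if app_is_reachable() else "not reachable")
```

A result of "not reachable" usually means the application has not finished starting or the setup step failed, so the terminal output of make app or docker compose up is the next place to look.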
Key Definitions:
- Motifs are patterns of code that represent common programming constructs or idioms. They are used to identify and analyze similar code structures across different codebases.
- Hosts are the specific instances of code in a codebase that may exhibit similar behavior or structure to the defined motifs.
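To make the motif and host relationship concrete, here is a toy sketch of finding a small motif graph inside a larger host graph. It is a brute-force subgraph monomorphism search over hand-written edge lists, not the matching algorithm the application uses, and the function name and node names are hypothetical.

```python
from itertools import permutations


def find_motif_matches(motif_edges, host_edges):
    """Brute-force search for node mappings that embed the motif in the host.

    Both graphs are directed and given as iterables of (source, target) pairs.
    A mapping is a match when every motif edge lands on a host edge.
    Only suitable for very small graphs: the search is factorial in size.
    """
    motif_edges = set(motif_edges)
    host_edges = set(host_edges)
    motif_nodes = sorted({node for edge in motif_edges for node in edge})
    host_nodes = sorted({node for edge in host_edges for node in edge})

    matches = []
    # Try every injective assignment of motif nodes to host nodes.
    for candidate in permutations(host_nodes, len(motif_nodes)):
        mapping = dict(zip(motif_nodes, candidate))
        if all((mapping[a], mapping[b]) in host_edges for a, b in motif_edges):
            matches.append(mapping)
    return matches


# Hypothetical motif: a chain of three nodes, a -> b -> c.
motif = [("a", "b"), ("b", "c")]
# Hypothetical host: a hand-simplified view of a small program.
host = [("main", "say_hello"), ("say_hello", "cout"), ("main", "ret")]

print(find_motif_matches(motif, host))
# prints [{'a': 'main', 'b': 'say_hello', 'c': 'cout'}]
```

The single match shows the idea: the motif describes a shape, and a host "contains" it when some part of the host has that same shape.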
Web Interface Example
- First, you will need a single file of code. It can be any file type that's supported by Joern. For this example, we suggest the following "Hello World" C++ code.
#include <iostream>

int say_hello() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

int main() {
    return say_hello();
}
Note: Not all uploaded files will produce a processed file.
- Next, with the web interface open, upload the source code as both a motif and a host. The motif instance name is arbitrary; for this example, we use "Hello".
- Once the file has been uploaded as both a motif and host, you should see the file in the processed files tab.

- You can then use the application to analyze and visualize the results, and then download those results.

Here, we can see that the "Hello" motif has been successfully matched with the host code.
Command Reference
The application provides several make targets for running it from the command line:
make main
Processes motifs and hosts, creates code property graphs (CPGs), and finds structural matches.
Output: results.json
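The layout of results.json is not described here, so the following sketch only reports the top-level shape of a JSON file without assuming any schema. The function name summarize_json is hypothetical, and the location of results.json relative to your working directory is an assumption.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level shape, assuming no schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # Show at most ten keys so large files stay readable.
        return {"type": "object", "size": len(data), "keys": sorted(data)[:10]}
    if isinstance(data, list):
        return {"type": "array", "size": len(data)}
    return {"type": type(data).__name__, "size": 1}


# Example call, assuming results.json is in the current directory:
# print(summarize_json("results.json"))
```

This is a reasonable first step before writing any script that depends on particular fields in the output.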
make evaluate
Creates a PNG visualization of the comparison results JSON file for easier evaluation.
Note: The visualization includes similarity measures that do not take into account the set threshold of incidental similarity required to negatively impact quality.
make unpack
Unpacks files from the DiverseVul dataset and loads them directly into the application.
make proto
Takes motif files and consolidates them into proto profiles, reducing the number of files that must be processed when processing hosts.
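The commands above can also be chained from a script. The following is a minimal sketch that runs make targets in order and stops at the first failure; the function name run_make_targets is hypothetical, and running main before evaluate is an assumption based on evaluate visualizing the results JSON that main produces.

```python
import subprocess


def run_make_targets(targets, repo_dir=".", dry_run=False):
    """Run `make <target>` for each target in order, stopping on failure."""
    commands = [["make", target] for target in targets]
    for command in commands:
        if dry_run:
            # Show what would be executed without running anything.
            print("would run:", " ".join(command))
            continue
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, cwd=repo_dir, check=True)
    return commands


# Example call, assuming the current directory is the cloned repository:
# run_make_targets(["main", "evaluate"])
```

Passing dry_run=True prints the commands instead of executing them, which is a safe way to confirm the sequence first.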
Developer Tools
For all available prebuilt commands, refer to the Makefile. For a more in-depth look at the processing, some good starting points are:
Troubleshooting
In most cases, when setup or installation issues arise, checking the logs and error messages can provide insight into the problem. Common issues include missing dependencies, incorrect configurations, or permission errors. If you encounter issues, consider the following steps:
- Review the terminal output for any error messages.
- Ensure all dependencies are installed and up to date.
- Check file permissions and ownership.
- Consult the documentation for any specific configuration requirements.
If all else fails, repeat the installation process from the beginning, ensuring that all steps are followed correctly.
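As a first dependency check, the sketch below reports which command-line tools are missing from your PATH. The default list (git, make, docker) is an assumption drawn from the commands used in this README, not an official requirements list, and the function name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("git", "make", "docker")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All checked tools were found on PATH.")
```

For a native Linux setup, docker is not needed, so you can pass a shorter tuple such as ("git", "make").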
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
For any inquiries or issues, please contact the MSU Engineering Lab.