WSU Pique Graph Isomorphism Application

This tool analyzes and compares source code structures using graph isomorphism.
It is useful for:

  • Code clone detection
  • Vulnerability detection
  • Software analysis

You can run the application on native Linux or inside a Docker container.

Table of Contents

  1. Setup (Native Linux)
  2. Setup (Docker)
  3. Using the Application
  4. Command Reference
  5. Developer Tools
  6. Troubleshooting
  7. License
  8. Contact
  9. Contributors

Setup (Native Linux)

1. Set Up a Linux Environment

First, choose a Linux distribution. We recommend Ubuntu, since that is what this application is tested on, but any Linux distribution should work with appropriate modifications. Once you have chosen a distribution, set up the environment in one of two ways:

  • The most direct way is to install Linux on your hard drive and configure your computer to boot into it.
  • An alternative to a bare-metal installation is virtualization: a virtual machine or WSL (Windows Subsystem for Linux) running a Linux distribution.

To install WSL on Windows, run the following in a Windows terminal:

wsl --install

2. Clone the Repository

Open the Linux terminal and run the following (apt is the package manager on Ubuntu and other Debian-based distributions):

sudo apt install git && git clone https://github.com/MSUSEL/wsu-pique-graph-isomorphism.git

This installs git if it is not already present and then clones the repository.

3. Set Up Processing Environment

Open the Linux terminal, navigate to the cloned repository directory, and then run the script to automatically set up the rest of the environment:

./setup_environment.sh

4. Run the Application

In the terminal, run the following command to start the application:

make app

Note: Some of these commands may require root privileges. If a command fails with a permission error, prepend sudo to it.

Setup (Docker)

1. Download Docker Desktop

Go to https://www.docker.com/products/docker-desktop and download the installer for your operating system.

2. Install Docker Desktop

Run the installer and follow the on-screen instructions to install Docker Desktop.

3. Clone the Repository

Open a terminal and run:

git clone https://github.com/MSUSEL/wsu-pique-graph-isomorphism.git

Alternatively, download the repository as a ZIP archive directly from GitHub and unzip it.

4. Build and Deploy the Docker Image

Navigate to the directory where the repository was cloned or unzipped, and run the following command to build and deploy the Docker image:

docker compose up

Using the Application

Once you've completed either version of the setup, you can open a web browser and enter the following URL to access the application: http://localhost:8501

In the web browser, you should see the application interface. Using the web interface, you can upload source code files, analyze them, and visualize the results.
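
If the page does not load right away, the application may still be starting. The sketch below polls the URL given above until it answers or a timeout expires. The function name `wait_for_app` and its parameters are hypothetical and not part of the application; only the URL comes from this README.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url: str = "http://localhost:8501", timeout: float = 30.0,
                 interval: float = 1.0) -> bool:
    """Poll `url` until it answers an HTTP request or `timeout` seconds pass."""
    # Ignore any proxy settings from the environment: the target is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            return True  # the server answered, even if with an error status
        except (urllib.error.URLError, OSError):
            pass  # nothing is listening yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_app()` with no arguments checks the default URL for up to 30 seconds and returns `True` as soon as the server responds.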

Key Definitions:

  • Motifs are patterns of code that represent common programming constructs or idioms. They are used to identify and analyze similar code structures across different codebases.
  • Hosts are the specific instances of code in a codebase that may exhibit similar behavior or structure to the defined motifs.
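
To make the two definitions concrete, the sketch below treats a motif and a host as small directed graphs and searches for every placement of the motif inside the host. It illustrates subgraph matching in general and is not the application's actual implementation: the adjacency-dictionary representation, the `find_matches` function, and the node names are all assumptions made for this example.

```python
from typing import Dict, Iterator, Set

# Directed graph as an adjacency dictionary: node -> set of successor nodes.
# Every node must appear as a key, even if it has no outgoing edges.
Graph = Dict[str, Set[str]]


def find_matches(motif: Graph, host: Graph) -> Iterator[Dict[str, str]]:
    """Yield every one-to-one mapping of motif nodes onto host nodes that
    preserves all motif edges."""
    motif_nodes = list(motif)

    def extend(mapping: Dict[str, str]) -> Iterator[Dict[str, str]]:
        if len(mapping) == len(motif_nodes):
            yield dict(mapping)
            return
        node = motif_nodes[len(mapping)]
        for candidate in host:
            if candidate in mapping.values():
                continue  # keep the mapping one-to-one
            mapping[node] = candidate
            # Each motif edge between already-mapped nodes must exist in the host.
            consistent = all(
                mapping[dst] in host[mapping[src]]
                for src in mapping
                for dst in motif[src]
                if dst in mapping
            )
            if consistent:
                yield from extend(mapping)
            del mapping[node]  # backtrack and try the next candidate

    yield from extend({})


# Hypothetical motif: "one node points to another".
motif: Graph = {"a": {"b"}, "b": set()}
# Hypothetical host: a three-node chain.
host: Graph = {"main": {"say_hello"}, "say_hello": {"cout"}, "cout": set()}

for match in find_matches(motif, host):
    print(match)
```

Matching a graph against itself always yields at least the identity mapping, which is why uploading the same file as both motif and host is a convenient first test.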

Web Interface Example

  1. First, you will need a single file of source code. It can be any file type that Joern supports. For this example, we suggest the following "Hello World" C++ program:

#include <iostream>

int say_hello() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

int main() {
    return say_hello();
}

Note: Not all uploaded files will produce a processed file.

  2. Next, with the web interface open, upload the source code as both a motif and a host. The motif instance name is arbitrary; this example uses "Hello".
  3. Once the file has been uploaded as both a motif and a host, it should appear in the processed files tab.

[Figure: processed files]

  4. You can then use the application to analyze and visualize the results, and download them.

[Figure: visualization example]

Here, we can see that the "Hello" motif has been successfully matched with the host code.

Command Reference

The application provides several make targets for running it from the command line:

make main

Processes motifs and hosts, creates code property graphs (CPGs), and finds structural matches.
Output: results.json
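
This README does not document the layout of results.json, so a schema-agnostic way to get oriented is to outline the file's structure before writing anything that depends on specific fields. The sketch below does that; the `describe` and `describe_file` helpers are hypothetical and assume only that the file is valid JSON in the current directory.

```python
import json
from pathlib import Path
from typing import Any


def describe(value: Any, depth: int = 2) -> Any:
    """Outline a parsed JSON value: key names for objects, lengths for
    arrays, and type names for scalars, down to `depth` nesting levels."""
    if isinstance(value, dict):
        if depth == 0:
            return f"object with {len(value)} keys"
        return {key: describe(child, depth - 1) for key, child in value.items()}
    if isinstance(value, list):
        if not value or depth == 0:
            return f"array of {len(value)} items"
        # Show only the first element as a representative of the array.
        return [f"array of {len(value)} items, first:", describe(value[0], depth - 1)]
    return type(value).__name__


def describe_file(path: str = "results.json") -> Any:
    """Load a JSON file and outline its structure without assuming a schema."""
    return describe(json.loads(Path(path).read_text(encoding="utf-8")))


if __name__ == "__main__":
    if Path("results.json").exists():
        print(json.dumps(describe_file(), indent=2))
```

Once the real field names are visible in the outline, a more specific script can read them directly.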

make evaluate

Creates a PNG visualization of the comparison results JSON file for easier evaluation.

Note: The visualization includes similarity measures that do not take into account the set threshold of incidental similarity needed to negatively impact quality.

make unpack

Unpacks files from the DiverseVul dataset and loads them directly into the application.

make proto

Combines motif files into proto profiles, reducing the number of files that must be processed each time hosts are processed.

Developer Tools

For all prebuilt commands, refer to the Makefile. For a more in-depth look at the processing, some good starting points are:

Troubleshooting

In most cases, when setup or installation issues arise, checking the logs and error messages can provide insight into the problem. Common issues include missing dependencies, incorrect configurations, or permission errors. If you encounter issues, consider the following steps:

  1. Review the terminal output for any error messages.
  2. Ensure all dependencies are installed and up to date.
  3. Check file permissions and ownership.
  4. Consult the documentation for any specific configuration requirements.
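
As a quick first pass on step 2, the sketch below reports which of the command-line tools used in this README (git, make, and docker) cannot be found on PATH. The `missing_tools` helper is hypothetical. It does not check versions or the packages installed by setup_environment.sh, and docker is only needed for the Docker setup.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("git", "make", "docker")) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("git, make, and docker are all on PATH.")
```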

If all else fails, repeat the installation process from the beginning, ensuring that all steps are followed correctly.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For any inquiries or issues, please contact MSU Engineering Lab.

Contributors