
A Git Query Language written in Rust

28 Jul 2023, CPOL, 3 min read
An introduction to the design and implementation of GQL (Git Query Language), which performs SQL-like queries on .git files, covering its features and how it is implemented in the Rust programming language.

Introduction

Last month, I got interested in the Rust programming language and wanted to discover more about it, so I started learning the basics and reading open source projects written in Rust. I also opened a PR in the rust-analyzer project; it relied less on my knowledge of Rust than on my general knowledge of compilers and static analysis. As usual, I love to learn new things by building projects around ideas that interest me.

The Idea

I started to think about small tools that I would love to use, for example, a faster search CLI or some utility apps. But then I got a cool new idea.

While reading Building Git (a book about building Git from scratch), I learned what each file inside the .git folder does and how Git stores commits, branches and other data and manages its own database. So what if we had a query language that runs on those files?

The Git Query Language (GQL)

I decided to implement this query language, and I named it GQL. I was very excited to start this project because it was my first time implementing a query language. I decided to implement it from scratch rather than converting the .git files into an SQLite database and running normal SQL queries. I also thought it would be cool if, in the future, I could use the GQL engine as part of a Git client or analyzer.

The Implementation of GQL

The goal is to implement it in two parts. The first part converts the GQL query into an AST (abstract syntax tree) of nodes. The second part is the engine, which walks the AST and executes it as an interpreter; in the future, the AST could instead be compiled into bytecode instructions for a GQL virtual machine.
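To make the two parts concrete, here is a minimal sketch of the same split for a tiny subset of the language (SELECT fields FROM table with an optional LIMIT). It is not the actual GQL code: the names SelectQuery, Row, parse and execute are hypothetical, and the tables are plain in-memory maps instead of data read from a .git folder.

```rust
use std::collections::HashMap;

/// Hypothetical AST for a tiny subset: SELECT <fields> FROM <table> [LIMIT n].
#[derive(Debug, PartialEq)]
pub struct SelectQuery {
    pub fields: Vec<String>, // ["*"] selects every attribute
    pub table: String,
    pub limit: Option<usize>,
}

/// One row of a table: attribute name -> value.
pub type Row = HashMap<String, String>;

/// Part one: turn the query text into an AST, or explain what went wrong.
pub fn parse(query: &str) -> Result<SelectQuery, String> {
    // Splitting on whitespace and commas is enough for this subset
    // (it has no string literals or operators).
    let tokens: Vec<&str> = query
        .split(|c: char| c.is_whitespace() || c == ',')
        .filter(|t| !t.is_empty())
        .collect();
    let mut pos = 0;

    expect_keyword(&tokens, &mut pos, "SELECT")?;
    let mut fields = Vec::new();
    while pos < tokens.len() && !tokens[pos].eq_ignore_ascii_case("FROM") {
        fields.push(tokens[pos].to_string());
        pos += 1;
    }
    if fields.is_empty() {
        return Err("expected at least one field after SELECT".to_string());
    }

    expect_keyword(&tokens, &mut pos, "FROM")?;
    let table = tokens.get(pos).ok_or("expected a table name after FROM")?.to_string();
    pos += 1;

    let mut limit = None;
    if pos < tokens.len() {
        expect_keyword(&tokens, &mut pos, "LIMIT")?;
        let raw = tokens.get(pos).ok_or("expected a number after LIMIT")?;
        limit = Some(raw.parse::<usize>().map_err(|e| e.to_string())?);
        pos += 1;
    }
    if pos != tokens.len() {
        return Err(format!("unexpected token `{}`", tokens[pos]));
    }
    Ok(SelectQuery { fields, table, limit })
}

/// Consume `keyword` (case-insensitive) at `pos`, or return an error.
fn expect_keyword(tokens: &[&str], pos: &mut usize, keyword: &str) -> Result<(), String> {
    match tokens.get(*pos) {
        Some(t) if t.eq_ignore_ascii_case(keyword) => {
            *pos += 1;
            Ok(())
        }
        Some(t) => Err(format!("expected `{}`, found `{}`", keyword, t)),
        None => Err(format!("expected `{}`, found end of query", keyword)),
    }
}

/// Part two: walk the AST and evaluate it against in-memory tables.
pub fn execute(
    query: &SelectQuery,
    tables: &HashMap<String, Vec<Row>>,
) -> Result<Vec<Row>, String> {
    let rows = tables
        .get(&query.table)
        .ok_or_else(|| format!("unknown table `{}`", query.table))?;
    let take = query.limit.unwrap_or(rows.len());
    let select_all = query.fields.iter().any(|f| f == "*");
    Ok(rows
        .iter()
        .take(take)
        .map(|row| {
            if select_all {
                row.clone()
            } else {
                // Keep only the requested attributes that exist on this row.
                query
                    .fields
                    .iter()
                    .filter_map(|f| row.get(f).map(|v| (f.clone(), v.clone())))
                    .collect()
            }
        })
        .collect())
}
```

Keeping parse and execute separate is what would allow the interpreter to be swapped for a bytecode compiler later without touching the front end.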

The engine deals with .git files through the Rust bindings for the libgit2 library (the git2 crate), so it can perform select, update and delete tasks. It also stores the selected data in a data structure so that we can filter or sort it.

To simplify the implementation, I created a struct called GQLObject that can represent a commit, branch, tag or any other object in the engine. This also makes it easy to sort, search and filter with single functions that work on this one type.

Rust
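use std::collections::HashMap;

// A generic record: attribute name -> attribute value, both stored as strings.
pub struct GQLObject {
  pub attributes: HashMap<String, String>,
}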

The GQLObject is just a map with strings as keys and values, so it is general enough to hold the info of any object type. Features like comparison, filtering or sorting can then be implemented easily on this string map.
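As an illustration of why the string map makes this easy, here is a minimal sketch of a filter and a sort that work on any GQLObject regardless of what it represents. The helpers object, filter_contains and sort_by_field are hypothetical names made up for this example, not functions from the GQL code base.

```rust
use std::collections::HashMap;

// Same shape as the struct shown in the article.
pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

/// Hypothetical convenience constructor used only in this sketch.
pub fn object(pairs: &[(&str, &str)]) -> GQLObject {
    let attributes = pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    GQLObject { attributes }
}

/// Keep only the objects whose `field` value contains `needle`.
/// Objects that do not have the field are dropped.
pub fn filter_contains(objects: Vec<GQLObject>, field: &str, needle: &str) -> Vec<GQLObject> {
    objects
        .into_iter()
        .filter(|o| o.attributes.get(field).map_or(false, |v| v.contains(needle)))
        .collect()
}

/// Sort the objects in place by the string value of `field`.
/// Objects that do not have the field sort first (None < Some).
pub fn sort_by_field(objects: &mut [GQLObject], field: &str) {
    objects.sort_by(|a, b| a.attributes.get(field).cmp(&b.attributes.get(field)));
}
```

One trade-off of this design is that every comparison is a string comparison, so a numeric attribute stored as text sorts lexicographically ("10" comes before "9") unless it is parsed first.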

The Current State

It is now possible to use many SQL features, such as aggregations, GROUP BY, ORDER BY, WHERE and HAVING clauses, LIMIT and OFFSET, etc.

SQL
SELECT name, count(name) AS commit_num FROM commits GROUP BY name ORDER BY commit_num DES LIMIT 10
SELECT commit_count FROM branches WHERE commit_count BETWEEN 0 .. 10

SELECT * FROM refs WHERE type = "branch"
SELECT * FROM refs ORDER BY type

SELECT * FROM commits
SELECT name, email FROM commits
SELECT name, email FROM commits ORDER BY name DES
SELECT name, email FROM commits WHERE name contains "gmail" ORDER BY name
SELECT * FROM commits WHERE name.lower = "amrdeveloper"
SELECT name FROM commits GROUP BY name
SELECT name FROM commits GROUP BY name HAVING name = "AmrDeveloper"

SELECT * FROM branches
SELECT * FROM branches WHERE ishead = "true"
SELECT * FROM branches WHERE name ends_with "master"
SELECT * FROM branches WHERE name contains "origin"

SELECT * FROM tags
SELECT * FROM tags OFFSET 1 LIMIT 1

For example, this query selects the names of the top 10 contributors and their number of commits.

SQL
SELECT name, count(name) AS commit_num FROM commits GROUP BY name ORDER BY commit_num DES LIMIT 10

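To give an idea of the work the engine has to do for the top contributors query, here is a minimal sketch of the grouping, counting, ordering and limiting steps over a list of GQLObject values. The function top_contributors is a hypothetical name for this example, and the sketch assumes each commit object has a "name" attribute, as in the query; the real engine may be organized differently.

```rust
use std::collections::HashMap;

// Same shape as the struct shown in the article.
pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

/// Returns up to `limit` (name, commit count) pairs, most commits first.
pub fn top_contributors(commits: &[GQLObject], limit: usize) -> Vec<(String, usize)> {
    // GROUP BY name with count(name): one counter per distinct name.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for commit in commits {
        if let Some(name) = commit.attributes.get("name") {
            *counts.entry(name.as_str()).or_insert(0) += 1;
        }
    }

    // ORDER BY commit_num, descending. Ties are broken by name so that
    // the result does not depend on the HashMap iteration order.
    let mut ranked: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(name, count)| (name.to_string(), count))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));

    // LIMIT
    ranked.truncate(limit);
    ranked
}
```

Because the counts live in a separate map, the same pattern extends to other aggregations by changing what is accumulated per group.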

The Next Step

The next step is to optimize the code and start supporting more features. For example, imagine a query for deleting all branches except master.

SQL
delete * from branches where name != "master"

Also, we can split the project into small libraries so that it can be used in many different ways, such as integrating it into a Git client or an IDE.

The GQL project is free and open source, so everyone is most welcome to contribute, suggest features or report bugs.

Install or Build

You can find the full guide on how to install it from binaries, with Cargo, or by building it from source on the project website.

GitHub: https://github.com/AmrDeveloper/GQL

I look forward to your opinion and feedback.

You can find me on GitHub.

I hope you enjoyed my article. Enjoy programming!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)