MENU

Cord Path

Repository

I developed Cord Path to explore the intersection of high-level systems programming in Rust and low-level GPGPU acceleration with CUDA. This CLI tool solves the Traveling Salesman Problem (TSP) by offloading the computationally expensive distance matrix calculations to the GPU, followed by heuristic optimization in Rust. By utilizing hardware acceleration, the system can handle large coordinate sets that would be prohibitively slow on a CPU-only architecture.

CUDA-Accelerated Distance Matrix

The most significant bottleneck in TSP solvers is the O(N²) calculation of point-to-point distances. To overcome this, I implemented a custom CUDA kernel that computes the entire distance matrix in parallel on the GPU. By leveraging thousands of threads to calculate Euclidean distances simultaneously, the tool achieves near-instant preprocessing even for thousands of coordinates. The Rust side interacts with the GPU via a static library and FFI (Foreign Function Interface), ensuring zero-cost abstractions between the two languages.

Pathfinding Heuristics

Once the distance matrix is populated, the Rust engine takes over to generate a valid path. I implemented a Nearest Neighbor algorithm to establish a greedy initial solution. This heuristic starts at a user-defined coordinate and iteratively visits the closest unvisited node. This provides a fast, "good enough" baseline that can then be refined by more complex optimization passes.

2-opt Local Optimization

To further improve the initial path, I implemented the 2-opt algorithm—a local search heuristic that removes edge crossings. The algorithm iteratively swaps pairs of edges if the swap results in a shorter total path length. This optimization pass typically reduces the path distance by 4-10%, transforming a rough greedy route into a highly efficient trajectory. The implementation uses Rust's mutable slices and iterators to achieve high performance during the search.

Performance Results

The combination of CUDA and Rust yields exceptional performance. In a benchmark with 5,000 coordinates, the distance matrix was computed in under 5ms on an NVIDIA RTX 30-series GPU—a task that takes significantly longer on a single CPU core. The final tool is not only accurate but also highly scalable, making it suitable for industrial pathfinding applications where coordinate sets are large and complex.