parallel-systems-lab

This repository contains the reports and source code written for the lab of the Parallel Processing Systems course of the School of Electrical and Computer Engineering at the National Technical University of Athens.

The lab consists of 4 exercises:

Exercise 1 : Familiarization with the programming environment

Its main goal was to parallelize a serial version of Conway's Game of Life on a shared-memory architecture using the OpenMP API.
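
The parallelization above amounts to splitting the grid sweep across threads. Below is a minimal sketch (not the repository's code; the grid size `N` and the one-cell dead border are assumptions of this sketch) of one Game of Life generation with an OpenMP parallel-for; compile with `-fopenmp` to enable the pragma:

```c
#define N 8  /* interior grid size; hypothetical for this sketch */

/* Compute the next generation from `cur` into `next`.
 * Rows are split across threads; each cell only reads `cur` and
 * writes its own entry of `next`, so no synchronization is needed. */
void gol_step(int cur[N + 2][N + 2], int next[N + 2][N + 2])
{
    #pragma omp parallel for
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            int alive = 0;
            for (int di = -1; di <= 1; di++)
                for (int dj = -1; dj <= 1; dj++)
                    if (di || dj)
                        alive += cur[i + di][j + dj];
            /* Standard rules: a live cell survives with 2 or 3 live
             * neighbors; a dead cell becomes alive with exactly 3. */
            next[i][j] = (alive == 3) || (cur[i][j] && alive == 2);
        }
    }
}
```

The double-buffering (`cur`/`next`) is what makes the loop embarrassingly parallel: no thread ever reads a cell another thread is writing.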

Exercise 2 : Algorithm Parallelization and Optimization in Shared Memory Architectures

The goal was to parallelize the K-means clustering algorithm and the Floyd-Warshall algorithm on a shared-memory architecture (a NUMA node) using the OpenMP API.

  • For the K-means clustering algorithm, I was assigned to develop two parallel versions: one with cluster arrays shared between the threads and updated with atomic operations, and one with per-thread copies of the clusters that are later reduced into a single final array.

  • Benchmarked and compared 5 different lock implementations on the K-means clustering algorithm, examining the differences in their implementations.

  • For the Floyd-Warshall algorithm, the goal was to parallelize its recursive version (more cache-friendly than the iterative one) using OpenMP tasks.

  • Benchmarked and compared the serial and parallel versions on a NUMA node and observed the tradeoffs of this architecture.

  • Benchmarked and compared 5 concurrent linked-list implementations and commented on their performance differences.

  • The Report

  • More info on /parlab-ex02
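
The two K-means accumulation strategies described above can be sketched as follows. This is a hedged illustration, not the repository's code: the 1-D points, the function names, and `K` are assumptions of the sketch. Compile with `-fopenmp`; without it the pragmas are ignored and the code runs serially:

```c
#define K 2  /* number of clusters; hypothetical for this sketch */

/* Version A: shared accumulators, updated with atomic operations. */
void accumulate_atomic(const double *pts, const int *assign,
                       double sum[K], int cnt[K], int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int c = assign[i];
        #pragma omp atomic
        sum[c] += pts[i];
        #pragma omp atomic
        cnt[c] += 1;
    }
}

/* Version B: each thread accumulates into private copies, which are
 * then reduced into the shared arrays (a critical section keeps the
 * reduction simple in this sketch). */
void accumulate_reduce(const double *pts, const int *assign,
                       double sum[K], int cnt[K], int n)
{
    #pragma omp parallel
    {
        double lsum[K] = {0};
        int lcnt[K] = {0};
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            lsum[assign[i]] += pts[i];
            lcnt[assign[i]] += 1;
        }
        #pragma omp critical
        for (int c = 0; c < K; c++) {
            sum[c] += lsum[c];
            cnt[c] += lcnt[c];
        }
    }
}
```

The tradeoff: version A contends on every point (one atomic per update), while version B pays only one short reduction per thread at the end, at the cost of extra memory per thread.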
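
For the recursive Floyd-Warshall, the standard cache-oblivious formulation splits the distance matrix into quadrants and recurses; independent quadrant updates can be spawned as OpenMP tasks. The sketch below is an illustration under assumptions (matrix size `NV`, base-tile size `BASE`, and function names are all hypothetical), not the repository's implementation. Compile with `-fopenmp`; without it the pragmas are ignored and the recursion runs serially:

```c
#define NV 4          /* number of vertices (power of two); hypothetical */
#define LD NV         /* leading dimension of the distance matrix */
#define BASE 2        /* base-case tile size */
#define INF 1000000   /* "no edge" sentinel; sums must not overflow int */

/* Update tile A with paths that go through B (rows) and C (columns).
 * A, B, C are n-by-n tiles of the global matrix, stored with stride LD. */
static void fwr(int *A, int *B, int *C, int n)
{
    if (n <= BASE) {
        /* base case: iterative min-plus update, k in the outer loop */
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (B[i * LD + k] + C[k * LD + j] < A[i * LD + j])
                        A[i * LD + j] = B[i * LD + k] + C[k * LD + j];
        return;
    }
    int h = n / 2;
    int *A11 = A, *A12 = A + h, *A21 = A + h * LD, *A22 = A + h * LD + h;
    int *B11 = B, *B12 = B + h, *B21 = B + h * LD, *B22 = B + h * LD + h;
    int *C11 = C, *C12 = C + h, *C21 = C + h * LD, *C22 = C + h * LD + h;

    fwr(A11, B11, C11, h);
    #pragma omp task            /* A12 and A21 are independent */
    fwr(A12, B11, C12, h);
    #pragma omp task
    fwr(A21, B21, C11, h);
    #pragma omp taskwait
    fwr(A22, B21, C12, h);

    fwr(A22, B22, C22, h);
    #pragma omp task            /* again independent of each other */
    fwr(A21, B22, C21, h);
    #pragma omp task
    fwr(A12, B12, C22, h);
    #pragma omp taskwait
    fwr(A11, B12, C21, h);
}

void floyd_warshall(int *D)
{
    /* tasks must be spawned from inside a parallel region */
    #pragma omp parallel
    #pragma omp single
    fwr(D, D, D, NV);
}
```

Only two of the eight recursive calls at each level are mutually independent, which is why tasks (rather than a parallel-for) fit this algorithm.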

Exercise 3 : Algorithm Parallelization and Optimization on GPUs

The goal was to implement 4 parallel versions of the K-means algorithm on a GPU using NVIDIA's CUDA API.

  • The first version is called Naive due to its non-uniform (uncoalesced) memory accesses.

  • The second version is called Transpose because two of the arrays are transposed so that memory accesses become uniform (coalesced).

  • The third version is called Shared because it places the clusters array in the GPU's shared memory for each thread block.

  • The fourth version is called Full-Offload (All-GPU) because it avoids CPU-GPU communication inside the program's main loop and instead performs the loops entirely on the GPU (with minimal communication between host and device).

  • Thoroughly benchmarked the 4 versions and observed significant performance improvements over running the algorithm solely on the CPU.

  • As expected (and for reasons explained in the report), the best-performing version is Full-Offload.

  • Plotted the results of the benchmarks and explained the performance differences I observed between the 4 versions by exploring the GPU's and CUDA's internals.

  • The Report

  • More info on /parlab-ex03
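
The Naive-versus-Transpose distinction above comes down to array layout. With one CUDA thread per point, a point-major layout makes consecutive threads read addresses a full point apart, while the transposed (coordinate-major) layout makes them read adjacent addresses, which the GPU can coalesce into one memory transaction. The plain-C sketch below illustrates only the index math (the sizes and function names are assumptions of this sketch, not the repository's kernels):

```c
#define NPTS 4  /* number of points; hypothetical */
#define DIM 3   /* coordinates per point; hypothetical */

/* index of coordinate d of point i in each layout */
int idx_naive(int i, int d)     { return i * DIM + d; }   /* point-major */
int idx_transpose(int i, int d) { return d * NPTS + i; }  /* coord-major */

/* build the transposed array from the point-major one */
void transpose(const double *naive, double *trans)
{
    for (int i = 0; i < NPTS; i++)
        for (int d = 0; d < DIM; d++)
            trans[idx_transpose(i, d)] = naive[idx_naive(i, d)];
}
```

In the transposed layout, threads `i` and `i + 1` reading coordinate `d` touch addresses exactly one element apart, which is the access pattern that coalesces on the GPU.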

Exercise 4 : Algorithm Parallelization and Optimization on Distributed Memory Architectures

The goal was to parallelize 2 different algorithms, K-means and 2D heat transfer, assuming a distributed memory architecture and using MPI.

  • The K-means algorithm was parallelized by assigning different objects to each MPI process and communicating between processes in each iteration.

  • The 2-dimensional heat transfer problem was solved with the Jacobi method (an iterative method for solving partial differential equations): each MPI process was assigned a smaller block of the global 2D grid and communicated with the other processes when needed.

  • I highly suggest taking a look at the source code of the Jacobi MPI implementation (Link).

  • Benchmarked each version for different configurations and numbers of MPI processes and plotted the results accordingly.

  • The Report

  • More info on /parlab-ex04
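
The local computation each rank performs in the Jacobi scheme is a 5-point stencil sweep over its block, padded with one layer of ghost cells. The sketch below shows only that serial sweep (the block size `NX`/`NY` and function name are assumptions of this sketch); in the actual MPI version, each iteration would first exchange the ghost layers with neighboring ranks (e.g. via `MPI_Sendrecv`) before sweeping:

```c
#define NX 6  /* local block height; hypothetical */
#define NY 6  /* local block width; hypothetical */

/* One Jacobi sweep over the local block: each interior point becomes
 * the average of its four neighbors. Rows/columns 0 and NX+1/NY+1 are
 * the ghost layer, filled either by boundary conditions or by MPI
 * exchange with neighboring ranks. */
void jacobi_sweep(double u[NX + 2][NY + 2], double unew[NX + 2][NY + 2])
{
    for (int i = 1; i <= NX; i++)
        for (int j = 1; j <= NY; j++)
            unew[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                               + u[i][j - 1] + u[i][j + 1]);
}
```

Because each sweep reads only `u` and writes only `unew`, ranks need to communicate just the one-cell-deep borders per iteration, which is what makes the block decomposition scale.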
