My work on the tiny-gpu project, forked from https://github.com/adam-maj/tiny-gpu, consists of two major parts: a bare-minimum assembler that makes verification of the design easier, and a set of caches for both program and data memory.
Project details, source code, and examples for the assembler are available at https://github.com/dsandall/tiny-gpu-assembler. I suggest starting with the README for an overview of the project's scope, and tiny-assembler.pdf for usage instructions.
Project details, source code, and verification tests for the caches are available at https://github.com/dsandall/tiny-gpu.
I also highly recommend reading the project's writeup (https://github.com/dsandall/tiny-gpu/blob/master/tiny-cache-paper.pdf) for a better understanding of the architecture, my thought process, and the final results.