hardware-accelerator

Here are 39 public repositories matching this topic...

cucapra / filament

Fearless hardware design

fpga type-system hardware-description-language hardware-accelerator

Updated Aug 20, 2025
Verilog

BoooC / CNN-Accelerator-Based-on-Eyeriss-v2

A Flexible and Energy Efficient Accelerator For Sparse Convolution Neural Network

deep-neural-networks accelerator convolutional-neural-networks sparse-matrix gemm noc im2col hardware-accelerator eyeriss-v2

Updated Jul 22, 2025
Verilog

metr0jw / Event-Driven-Spiking-Neural-Network-Accelerator-for-FPGA

Energy-efficient Event-driven Spiking Neural Network accelerator for FPGA with PyTorch integration

pytorch embedded-systems computational-neuroscience verilog xilinx spiking-neural-networks vivado pynq verilog-hdl snn neuromorphic-computing leaky-integrate-and-fire-neuron lif-neuron spiking-neural-network vitis hardware-accelerator lif-model

Updated May 1, 2026
VHDL

CMU-SAFARI / SneakySnake

SneakySnake is the first and the only pre-alignment filtering algorithm that works efficiently and fast on modern CPU, FPGA, and GPU architectures. It greatly (by more than two orders of magnitude) expedites sequence alignment calculation for both short and long reads. Described in Bioinformatics (2020) by Alser et al. https://arxiv.o…

fpga gpu smith-waterman needleman-wunsch sequence-alignment long-reads minimap2 short-reads hardware-accelerator sequence-aligner edlib pre-alignment-filtering wavefront-alignment

Updated Mar 31, 2023
VHDL

yonseicasl / NPUsim

NPUsim: Full-Model, Cycle-Level, and Value-Aware Simulator for DNN Accelerators

machine-learning acceleration simulator deep-learning memory model accelerator architecture modeling hierarchy cycle-accurate deep-neural-network npu hardware-accelerator full-system value-aware functional-simulation timing-simulation

Updated Jan 2, 2025
C++

dromara / rsmedia

Audio/video toolkit based on FFmpeg (6.x and 7.x supported) for multimedia with hardware acceleration.

audio video ffmpeg multimedia media audio-streaming video-streaming mux hardware-acceleration encoder-decoder muxer hardware-accelerator

Updated Mar 24, 2026
Rust

Intuity / nexus

Open source RTL simulation acceleration on commodity hardware

hardware simulation rtl verilog systemverilog hardware-acceleration hardware-accelerator hardware-accelerators

Updated Apr 13, 2023
Python

cogsys-tudelft / chameleon

Chameleon: A Multiplier-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data

open-hardware dnn edge-computing tcn continual-learning few-shot-learning hardware-accelerator ai-accelerator

Updated Mar 5, 2026
Python

yonseicasl / NeuroSpector

NeuroSpector: Dataflow and Mapping Optimizer for Deep Neural Network Accelerators

machine-learning acceleration deep-learning mapping optimization accelerator scheduler architecture scheduling optimizer dataflow hierarchy deep-neural-network hardware-accelerator

Updated Mar 20, 2025
C++

CMU-SAFARI / GenStore

GenStore is the first in-storage processing system designed for genome sequence analysis that greatly reduces both data movement and computational overheads of genome sequence analysis by exploiting low-cost and accurate in-storage filters. Described in the ASPLOS 2022 paper by Mansouri Ghiasi et al. at https://people.inf.ethz.ch/omutlu/pub/GenS…

ftl ssd sequence-alignment read-mapping long-reads hardware-accelerator near-data-processing pre-alignment-filtering in-storage-processing exact-matching

Updated Apr 6, 2022
C

pccxai / pccx-FPGA-NPU-LLM-kv260

Bare-metal FPGA implementation of the pccx NPU for LLM inference on Kria KV260: SystemVerilog RTL, W4A8 quantization, GEMM/GEMV datapaths, KV-cache scheduling, and driver code.

asic fpga parallel-computing transformer xilinx isa systemverilog quantization gemma xilinx-fpga inference-engine npu edge-ai hardware-accelerator kv260 llm gemma3n gemma4

Updated May 5, 2026
SystemVerilog

yonseicasl / NPUWattch

NPUWattch: ML-based Power, Area, and Timing Modeling for Neural Accelerators

machine-learning library acceleration simulator simulation model accelerator architecture modeling timing estimation power estimator area pdk npu neural-accelerator hardware-accelerator post-layout

Updated Apr 13, 2026
HCL

certainly-param / garuda-accelerator

Garuda: CVXIF coprocessor optimizing batch-1 attention microkernels with 7.5-9× lower p99 latency. RISC-V INT8 MAC accelerator for transformer inference.

machine-learning neural-network inference simd low-latency systemverilog attention-mechanism risc-v int8 systemverilog-hdl systolic-arrays edge-ai hardware-accelerator int8-quantization cva6 custom-instructions ai-hardware cvxif

Updated Jan 23, 2026
SystemVerilog

sfu-arch / muir-sim

muIR C++ Simulator

simulation dpi verilator hardware-accelerator muir-sim

Updated May 30, 2022
C++

AhmedSobhy01 / convolution-accelerator

Hardware accelerator for 2D convolution using an 8×8 weight-stationary systolic array with split-kernel support, dual-port SRAM architecture, and DMA-based streaming

fpga verilog sram convolution vlsi dma digital-design systolic-arrays rtl-design hardware-accelerator systolic-array

Updated Feb 8, 2026
Verilog

pccxai / pccx

PCCX is an open NPU architecture for memory-bound Transformer inference on edge FPGAs, focused on GEMM/GEMV, KV-cache, W4A8 quantization, and custom ISA scheduling.

fpga neural-network parallel-computing transformer rtl isa deeplearning systemverilog computer-architecture quantization gemm inference-engine npu edge-ai hardware-accelerator gemv llm llm-inference llm-accelerator

Updated May 5, 2026
SystemVerilog

BhattSoham / Implementation-of-Runge-Kutta-Hardware-Accelerator

Hardware accelerator implementation in VHDL for solving an ordinary differential equation using Runge-Kutta numerical methods

vhdl computer-architecture runge-kutta xilinx-vivado hardware-accelerator zynq-zc702 ode-solvers

Updated Mar 9, 2024
VHDL

Raghavan-04 / Systolic-Tensor-Core

Systolic-Tensor-Core: references the "systolic array" architecture used in TPUs.

systemverilog pwm hardware-accelerator sync-fifo 2stage-mac

Updated Mar 12, 2026
C++

S1ddharthhh / Hardware-Accelerated_LeNet-5_in_PYNQ-Z2

A slightly modified vanilla LeNet-5 model trained on the German Traffic Sign benchmark dataset. After analysing the computation-heavy layers, I designed an IP using Vitis HLS 2024.1 and implemented it on the PYNQ-Z2 platform.

python acceleration fpga cnn pynq-z2 hardware-accelerator

Updated Dec 15, 2025
C

Raveem13 / axi-matrix-accelerator

This project implements an AXI-based matrix multiply accelerator.

accelerator verilog systemverilog hardware-designs hardware-acceleration axi xsim vivado-simulator hardware-accelerator uvm-verification matrix-multiply-accumulate ai-accelerator

Updated Mar 7, 2026
SystemVerilog

