nvtx
Here are 11 public repositories matching this topic...
Hooks CUDA-related dynamic libraries using automated code generation tools.
Updated Dec 12, 2023 - C
A safe Rust FFI binding for the NVIDIA® Tools Extension SDK (NVTX).
Updated Jan 18, 2024 - Rust
PyProf2: PyTorch Profiling tool
Updated Jun 25, 2020 - Python
Thin pybind11 wrapper for NVTX wrappers -- with some bells and whistles attached.
Updated May 6, 2021 - Python
🎬 Explore GPU training efficiency with FP32 vs FP16 in this modular lab, utilizing Tensor Core acceleration for deep learning insights.
Updated Sep 6, 2025 - Python
Profiling with Precision. Documenting with Style.
Updated Sep 3, 2025 - Jupyter Notebook
A reproducible GPU benchmarking lab that compares FP16 vs FP32 training on MNIST using PyTorch, CuPy, and Nsight profiling tools. This project blends performance engineering with cinematic storytelling—featuring NVTX-tagged training loops, fused CuPy kernels, and a profiler-driven README that narrates the GPU’s inner workings frame by frame.
Updated Sep 5, 2025 - Python