Application of Neural Rendering
Technologies Used
- ML Framework: PyTorch, Nerfstudio
- GPU Computing: CUDA, WebGPU
- Languages: Python, TypeScript
- Libraries: webgpu-torch, Viser (3D visualization)
My Contributions
Summer 2025 (CMPT 494)
Worked with Cameron to port PAPR (Proximity Attention Point Rendering) into the Nerfstudio framework.
Key Contributions:
- Codebase Porting: Ported the PAPR paper’s codebase to run within Nerfstudio
- Config Optimization: Explored hyperparameter settings to improve training results
- GPU Memory Debugging: Resolved memory issues occurring during training
- Scheduler Re-initialization Bug Fix: Fixed NaN value issues through fast-forward scheduling and learning rate tuning
- .ply File Export: Implemented functionality to export trained models in point cloud format
- Unit Test Development: Wrote unit tests for core components like PAPRField
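To illustrate the .ply export item above, here is a minimal sketch of writing point positions in the ASCII PLY format. It is not the project's exporter: the function names `ply_text` and `write_ply` are hypothetical, only xyz positions are written, and any extra per-point attributes the trained model may carry are omitted.

```python
from typing import Sequence


def ply_text(points: Sequence[Sequence[float]]) -> str:
    """Return an ASCII PLY document for a list of (x, y, z) points.

    Hypothetical helper for illustration; positions only.
    """
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    # One vertex per line, in the property order declared in the header.
    body = [f"{x:.6f} {y:.6f} {z:.6f}" for x, y, z in points]
    return "\n".join(header + body) + "\n"


def write_ply(path: str, points: Sequence[Sequence[float]]) -> None:
    """Write the point cloud to disk as an ASCII .ply file."""
    with open(path, "w", encoding="ascii") as f:
        f.write(ply_text(points))
```

With a trained model, the point positions would first be moved to the CPU and converted to a plain list before being passed to `write_ply`.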
Training Results Visualization
Training progress at 240k and 245k iterations, showing Ground Truth vs Predicted, Depth Map, Point Cloud, and Training Losses:


Fall 2025 (CMPT 495)
Led the implementation of client-side rendering with webgpu-torch.
Goal: Enable PAPR model rendering directly in the browser, without a Nerfstudio backend, to eliminate network latency.
Key Contributions:
- PAPR Model TypeScript Porting: Re-implemented the PyTorch-based PAPR model in TypeScript using the webgpu-torch library
- Missing Kernel Implementation: Implemented tensor operations missing from webgpu-torch (clamp, permute, softmax, transpose, conv2d, slice, sliceCopy, etc.)
- Debugging Large Tensor Support: Extended 1D dispatch to 2D dispatch to handle tensors with 16M+ elements
- Performance Analysis:
  - Analyzed kernel call overhead through Sin Benchmark
  - Measured execution time distribution using Chrome DevTools GPU profiling
  - Identified that 68.9% of render time was synchronization overhead (23,000+ WebGPU events for 512x512 render)
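The 1D-to-2D dispatch change can be sketched with a little arithmetic. WebGPU's default limit for `maxComputeWorkgroupsPerDimension` is 65535, so with a workgroup size of 256 (an assumption here; the size webgpu-torch actually uses may differ) a 1D dispatch covers at most 65535 × 256 = 16,776,960 elements, which is in line with the 16M+ threshold mentioned above. The function names below are hypothetical, and the Python is only a model of the index math, not the library's code.

```python
MAX_WORKGROUPS_PER_DIM = 65535  # WebGPU default limit per dispatch dimension
WORKGROUP_SIZE = 256            # assumed workgroup size, for illustration only


def dispatch_dims(num_elements: int,
                  workgroup_size: int = WORKGROUP_SIZE,
                  max_per_dim: int = MAX_WORKGROUPS_PER_DIM) -> tuple:
    """Return (x, y) workgroup counts that cover num_elements."""
    groups = (num_elements + workgroup_size - 1) // workgroup_size  # ceil division
    if groups <= max_per_dim:
        return (groups, 1)  # a plain 1D dispatch is enough
    # Spread the groups over two dimensions, keeping each within the limit.
    y = (groups + max_per_dim - 1) // max_per_dim
    x = (groups + y - 1) // y
    if y > max_per_dim:
        raise ValueError("tensor too large even for a 2D dispatch")
    return (x, y)


def flat_index(group_x: int, group_y: int, local_x: int,
               dims_x: int, workgroup_size: int = WORKGROUP_SIZE) -> int:
    """Element index a kernel invocation would compute under a 2D dispatch."""
    return (group_y * dims_x + group_x) * workgroup_size + local_x
```

Because `x * y` may slightly exceed the number of groups needed, a kernel using this scheme still has to bounds-check the flat index against the element count before reading or writing.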
Results:
- Successfully rendered in the browser for a single test case
- Render time 2.5x to 4x slower than PyTorch (presumably due to WebGPU synchronization overhead)
- Documented limitations and improvement directions for webgpu-torch library
| Resolution | Patches | Render Time | Kernel Calls |
|---|---|---|---|
| 64×64 | 1 | 0.38s | 567 |
| 256×256 | 16 | 2.74s | 5,502 |
| 512×512 | 64 | 11.35s | 21,366 |
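As a quick reading aid for the table, the following sketch derives per-kernel-call and per-patch costs from the measured values. The input numbers are copied from the table; the derived metric names are my own, and nothing here was measured separately.

```python
# (resolution, patches, render_time_s, kernel_calls), copied from the table above
MEASUREMENTS = [
    (64, 1, 0.38, 567),
    (256, 16, 2.74, 5502),
    (512, 64, 11.35, 21366),
]


def derived_metrics(resolution, patches, render_time_s, kernel_calls):
    """Compute simple ratios from one table row."""
    return {
        "resolution": resolution,
        "ms_per_kernel_call": 1000.0 * render_time_s / kernel_calls,
        "s_per_patch": render_time_s / patches,
        "kernel_calls_per_patch": kernel_calls / patches,
    }


ROWS = [derived_metrics(*m) for m in MEASUREMENTS]

if __name__ == "__main__":
    for row in ROWS:
        print(
            f"{row['resolution']}x{row['resolution']}: "
            f"{row['ms_per_kernel_call']:.3f} ms/call, "
            f"{row['s_per_patch']:.3f} s/patch, "
            f"{row['kernel_calls_per_patch']:.1f} calls/patch"
        )
```

On these numbers, the two multi-patch renders both cost roughly 0.17 to 0.18 s per patch, which suggests render time grows about linearly with patch count once a render spans more than one patch.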
Project Architecture
The project consists of multiple interconnected repositories:
3d-editor (main)
├── nerfstudio/
│   ├── models/papr_model.py        # Main PAPR model
│   ├── fields/papr_field.py        # Point-based field with attention
│   ├── pipelines/papr_pipeline.py  # Training pipeline
│   └── viewer/custom/              # Point cloud editor
├── papr-cuda-topk/                 # CUDA kernel optimization
└── viser/                          # Custom Viser fork
    └── webgpu-torch/               # WebGPU-torch fork (my focus)
Project Information
- Supervisor: Ke Li (keli@sfu.ca)
- Duration: May - December 2025
- Team size: two members (Summer), joined by a third member (Fall)
