Application of Neural Rendering

Published:

Technologies Used

  • ML Framework: PyTorch, Nerfstudio
  • GPU Computing: CUDA, WebGPU
  • Languages: Python, TypeScript
  • Libraries: webgpu-torch, Viser (3D visualization)

My Contributions

Summer 2025 (CMPT 494)

Worked with Cameron to port PAPR (Proximity Attention Point Rendering) into the Nerfstudio framework.

Key Contributions:

  • Codebase Porting: Ported the PAPR paper’s codebase to work with Nerfstudio
  • Config Optimization: Explored hyperparameter settings to improve training results
  • GPU Memory Debugging: Resolved memory issues occurring during training
  • Scheduler Re-initialization Bug Fix: Fixed NaN value issues through fast-forward scheduling and learning rate tuning
  • .ply File Export: Implemented functionality to export trained models in point cloud format
  • Unit Test Development: Wrote unit tests for core components like PAPRField
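
To illustrate the .ply export item above, here is a minimal sketch of writing a point cloud as an ASCII PLY file. It is not the project's actual exporter: the function name `write_ascii_ply`, the choice of per-point RGB colors, and the sample data are all assumptions made for illustration, and a real export would read positions from the trained model instead.

```python
import io
from typing import Sequence, TextIO


def write_ascii_ply(
    out: TextIO,
    points: Sequence[Sequence[float]],
    colors: Sequence[Sequence[int]],
) -> None:
    """Write (x, y, z) points and 0-255 RGB colors as an ASCII PLY point cloud."""
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    out.write("\n".join(header) + "\n")
    # One line per vertex, in the same order as the declared properties.
    for (x, y, z), (r, g, b) in zip(points, colors):
        out.write(f"{x:.6f} {y:.6f} {z:.6f} {int(r)} {int(g)} {int(b)}\n")


# Hypothetical data standing in for a trained model's point positions and colors.
points = [(0.0, 0.0, 0.0), (1.0, 0.5, -0.25)]
colors = [(255, 0, 0), (0, 128, 255)]
buffer = io.StringIO()
write_ascii_ply(buffer, points, colors)
ply_text = buffer.getvalue()
```

Writing to a file instead of a `StringIO` buffer only requires passing an open text file handle as `out`. A binary PLY variant would be more compact for large point clouds, but the ASCII form is easier to inspect while debugging.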

Training Results Visualization

Training progress at 240k and 245k iterations, showing Ground Truth vs Predicted, Depth Map, Point Cloud, and Training Losses:

Training Results at 240k iterations

Training Results at 245k iterations

Fall 2025 (CMPT 495)

Led the implementation of client-side rendering with webgpu-torch.

Goal: Enable PAPR model rendering directly in the browser, without the Nerfstudio backend, to eliminate network latency

Key Contributions:

  • PAPR Model TypeScript Porting: Re-implemented PyTorch-based PAPR model in TypeScript using webgpu-torch library
  • Missing Kernel Implementation: Implemented tensor operations missing in webgpu-torch (clamp, permute, softmax, transpose, conv2d, slice, sliceCopy, etc.)
  • Debugging Large Tensor Support: Extended 1D dispatch to 2D dispatch to handle tensors with 16M+ elements
  • Performance Analysis:
    • Analyzed kernel call overhead through Sin Benchmark
    • Measured execution time distribution using Chrome DevTools GPU profiling
    • Identified that 68.9% of render time was synchronization overhead (23,000+ WebGPU events for a 512×512 render)
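
The 1D-to-2D dispatch change above can be sketched in plain Python. WebGPU's default `maxComputeWorkgroupsPerDimension` limit is 65535, so with an assumed workgroup size of 256 threads a 1D dispatch covers at most 65535 × 256 = 16,776,960 elements, which is consistent with the 16M+ threshold mentioned above. The workgroup size and the helper names `dispatch_dims` and `flat_index` are assumptions for illustration, not the webgpu-torch implementation.

```python
from typing import Tuple

MAX_WORKGROUPS_PER_DIM = 65535  # WebGPU default limit maxComputeWorkgroupsPerDimension
WORKGROUP_SIZE = 256            # assumed threads per workgroup (hypothetical)


def dispatch_dims(num_elements: int) -> Tuple[int, int]:
    """Return (x, y) workgroup counts that cover num_elements within the limit."""
    workgroups = max((num_elements + WORKGROUP_SIZE - 1) // WORKGROUP_SIZE, 1)
    if workgroups <= MAX_WORKGROUPS_PER_DIM:
        return (workgroups, 1)  # a 1D dispatch is enough
    # Spill into the y dimension, then balance x so that x * y >= workgroups.
    y = (workgroups + MAX_WORKGROUPS_PER_DIM - 1) // MAX_WORKGROUPS_PER_DIM
    if y > MAX_WORKGROUPS_PER_DIM:
        raise ValueError("tensor too large for a 2D dispatch")
    x = (workgroups + y - 1) // y
    return (x, y)


def flat_index(wg_x: int, wg_y: int, local_id: int, dims_x: int) -> int:
    """Element index a thread would reconstruct from its 2D workgroup id."""
    return (wg_y * dims_x + wg_x) * WORKGROUP_SIZE + local_id


one_d_limit = MAX_WORKGROUPS_PER_DIM * WORKGROUP_SIZE
print(dispatch_dims(one_d_limit))      # still fits in one dimension
print(dispatch_dims(one_d_limit + 1))  # needs the second dimension
```

Because x × y can overshoot the required number of workgroups, each thread in the shader would still need a bounds check against the element count before reading or writing.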

Results:

  • Successfully rendered in the browser for a single test case
  • Render time 2.5x to 4x slower than PyTorch (presumably due to WebGPU synchronization overhead)
  • Documented limitations and improvement directions for webgpu-torch library

Resolution   Patches   Render Time   Kernel Calls
64×64        1         0.38 s        567
256×256      16        2.74 s        5,502
512×512      64        11.35 s       21,366
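
The measurements in the table above can be turned into two derived figures: average time per kernel call and kernel calls per patch. The short script below does that arithmetic using only the table values. The final estimate additionally assumes that the 68.9% synchronization share applies to the whole render and could be removed entirely, so it should be read as an optimistic upper bound and not as a measured result.

```python
# (resolution, patches, render_time_s, kernel_calls), copied from the results table
measurements = [
    ("64x64", 1, 0.38, 567),
    ("256x256", 16, 2.74, 5502),
    ("512x512", 64, 11.35, 21366),
]


def per_call_ms(render_time_s: float, kernel_calls: int) -> float:
    """Average wall-clock time per kernel call, in milliseconds."""
    return render_time_s * 1000.0 / kernel_calls


def calls_per_patch(kernel_calls: int, patches: int) -> float:
    """Average number of kernel calls issued per rendered patch."""
    return kernel_calls / patches


summary = [
    (res, per_call_ms(t, k), calls_per_patch(k, p))
    for res, p, t, k in measurements
]
for res, ms, cpp in summary:
    print(f"{res:>8}: {ms:.2f} ms per call, {cpp:.1f} calls per patch")

# Assumption: the reported 68.9% synchronization share covers the full render.
SYNC_SHARE = 0.689
max_speedup = 1.0 / (1.0 - SYNC_SHARE)
print(f"upper-bound speedup without synchronization: {max_speedup:.2f}x")
```

The per-call cost stays at roughly 0.5 to 0.7 ms across all three resolutions, so total render time tracks the number of kernel calls closely. The upper-bound speedup of about 3.2x falls inside the observed 2.5x to 4x gap to PyTorch, which is consistent with synchronization overhead being the main cause.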

Project Architecture

The project consists of multiple interconnected repositories:

3d-editor (main)
├── nerfstudio/
│   ├── models/papr_model.py          # Main PAPR model
│   ├── fields/papr_field.py          # Point-based field with attention
│   ├── pipelines/papr_pipeline.py    # Training pipeline
│   └── viewer/custom/                # Point cloud editor
├── papr-cuda-topk/                   # CUDA kernel optimization
└── viser/                            # Custom Viser fork
    └── webgpu-torch/                 # WebGPU-torch fork (my focus)

Project Information

  • Supervisor: Ke Li (keli@sfu.ca)
  • Duration: May - December 2025
  • Team: Two members (Summer), joined by a third member (Fall)

References