BERTVision

A parameter-efficient compression architecture for NLP tasks that achieves BERT-level performance at a fraction of the computational cost.

Research · UC Berkeley

Skills

Deep Learning · NLP · Parameter Efficiency · Model Compression

Tools

TensorFlow · PyTorch · Python · Azure

BERTVision is a parameter-efficient compression model architecture that achieves BERT-level performance on NLP tasks while significantly reducing computational and training costs.

Overview

The architecture leverages the hidden-state activations from all of BERT's transformer layers, which are typically discarded during standard fine-tuning and inference, enabling near-BERT performance with reduced training time and GPU/TPU requirements.
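
The sketch below illustrates this idea under stated assumptions: it uses the HuggingFace transformers API to collect every layer's hidden states (the project's own code may extract them differently), and the LayerPoolingHead class is a hypothetical, minimal compression head, not BERTVision's actual architecture.

```python
import torch
from transformers import BertModel, BertTokenizer

# Load BERT so it returns the hidden states of every layer, not just the last.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
bert.eval()

inputs = tokenizer("BERTVision reuses activations from every layer.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)

# Tuple of 13 tensors (embedding layer + 12 transformer layers), each of shape
# [batch, seq_len, 768]; stacked into a single [batch, 13, seq_len, 768] tensor.
all_layers = torch.stack(outputs.hidden_states, dim=1)

class LayerPoolingHead(torch.nn.Module):
    """Hypothetical compression head: a learned softmax-weighted average over
    layers followed by a linear classifier (a few thousand trainable parameters
    versus BERT's ~110M)."""
    def __init__(self, num_layers=13, hidden=768, num_labels=2):
        super().__init__()
        self.layer_weights = torch.nn.Parameter(torch.zeros(num_layers))
        self.classifier = torch.nn.Linear(hidden, num_labels)

    def forward(self, stacked):                       # [B, L, T, H]
        w = torch.softmax(self.layer_weights, dim=0)  # [L]
        pooled = (stacked * w[None, :, None, None]).sum(dim=1)  # [B, T, H]
        return self.classifier(pooled[:, 0])          # classify on [CLS] token

head = LayerPoolingHead()
logits = head(all_layers)
print(logits.shape)  # torch.Size([1, 2])
```

Because BERT itself stays frozen in a setup like this, only the small head is trained, which is where the reduction in training time and GPU/TPU requirements comes from.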

Key Achievements

  • Demonstrated near-BERT performance across multiple NLP tasks
  • Evaluated on the Stanford Question Answering Dataset 2.0 (SQuAD 2.0)
  • Benchmarked on tasks from the General Language Understanding Evaluation (GLUE) benchmark
  • Reduced computational requirements through parameter sharing and transfer learning

Technologies

Built with TensorFlow and PyTorch and trained on Azure cloud infrastructure with NVIDIA Tesla V100 GPUs. Hyperparameter optimization was performed with hyperopt, and model ensembling techniques were applied.
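
As a rough illustration of the hyperopt workflow, here is a minimal sketch; the search space, value ranges, and the train_and_evaluate() stub are assumptions for demonstration, not the project's actual tuning configuration.

```python
from hyperopt import fmin, tpe, hp, Trials

def train_and_evaluate(learning_rate, batch_size):
    # Placeholder: in practice this would fine-tune the compression head with
    # the given settings and return validation loss on the dev set.
    return (learning_rate - 3e-5) ** 2 + 0.001 * batch_size

def objective(params):
    # hyperopt minimizes the returned value.
    return train_and_evaluate(
        learning_rate=params["learning_rate"],
        batch_size=int(params["batch_size"]),
    )

space = {
    "learning_rate": hp.loguniform("learning_rate", -12, -8),  # ~6e-6 to 3e-4
    "batch_size": hp.choice("batch_size", [16, 32, 64]),
}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=25, trials=trials)
# Note: for hp.choice parameters, `best` reports the index of the chosen option.
print(best)
```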