BERTVision
A parameter-efficient compression model architecture for NLP tasks that achieves BERT-level performance with a fraction of the computational requirements.
Research · UC Berkeley
Skills
Deep Learning · NLP · Parameter Efficiency · Model Compression
Tools
TensorFlow · PyTorch · Python · Azure
BERTVision is a parameter-efficient compression model architecture that achieves BERT-level performance on NLP tasks while significantly reducing computational and training costs.
Overview
The project leverages the hidden-state activations of BERT's transformer layers, which are typically discarded during inference, to reach near-BERT performance with reduced training time and GPU/TPU requirements.
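As a rough illustration of that idea (not the project's actual code), here is a minimal PyTorch sketch assuming the Hugging Face `transformers` library: it requests every layer's hidden states from a frozen BERT encoder and feeds them to a small trainable head. The `CompressionHead` module and its dimensions are hypothetical.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

# Load a pretrained BERT encoder and request hidden states from every layer.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
bert.eval()  # BERT stays frozen; only the small head below would be trained.

class CompressionHead(nn.Module):
    """Hypothetical lightweight head that pools the stacked layer activations."""
    def __init__(self, hidden_size=768, num_layers=13, num_labels=2):
        super().__init__()
        # One learnable mixing weight per layer (embedding output + 12 encoder layers).
        self.layer_weights = nn.Parameter(torch.ones(num_layers) / num_layers)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):
        # hidden_states: tuple of (batch, seq_len, hidden) tensors, one per layer.
        stacked = torch.stack(hidden_states, dim=0)            # (layers, batch, seq, hidden)
        mixed = (self.layer_weights[:, None, None, None] * stacked).sum(dim=0)
        return self.classifier(mixed[:, 0])                    # classify from the [CLS] position

inputs = tokenizer("BERTVision reuses discarded activations.", return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)

head = CompressionHead()
logits = head(outputs.hidden_states)  # tiny trainable head on top of frozen BERT
```

The point of the sketch is that the expensive encoder runs only in inference mode, while all training happens in the small head that consumes the otherwise-discarded activations.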
Key Achievements
- Demonstrated near-BERT performance across multiple NLP tasks
- Evaluated on the Stanford Question Answering Dataset 2.0 (SQuAD 2.0)
- Benchmarked on the General Language Understanding Evaluation (GLUE) tasks
- Reduced computational requirements through parameter sharing and transfer learning (see the sketch after this list)
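To make the transfer-learning side of that last point concrete, the hedged sketch below (hypothetical helper and head, not the repository's code) freezes the pretrained encoder and counts only the trainable parameters of a compact head, which is the comparison that motivates the parameter-efficiency claim.

```python
import torch.nn as nn
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-uncased")

# Transfer learning: freeze all BERT weights so only the small head is trained.
for param in bert.parameters():
    param.requires_grad = False

# A hypothetical compact head standing in for the BERTVision compression model.
head = nn.Sequential(
    nn.Linear(768, 128),
    nn.ReLU(),
    nn.Linear(128, 2),
)

def count_trainable(module):
    """Count parameters that will actually receive gradient updates."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

print(f"Frozen BERT trainable params: {count_trainable(bert):,}")  # 0
print(f"Compression head params:      {count_trainable(head):,}")  # ~99k vs. ~110M in BERT-base
```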
Technologies
Built with TensorFlow and PyTorch, and trained on Azure cloud infrastructure with NVIDIA Tesla V100 GPUs. Hyperparameters were tuned with hyperopt, and model ensembling techniques were applied.
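As a hedged sketch of how a hyperopt search might be wired up (the search space and objective below are stand-ins, not the project's actual training loop):

```python
from hyperopt import fmin, tpe, hp, Trials

# Illustrative search space for a few training hyperparameters.
space = {
    "learning_rate": hp.loguniform("learning_rate", -12, -7),  # roughly 6e-6 to 9e-4
    "batch_size": hp.choice("batch_size", [16, 32, 64]),
    "dropout": hp.uniform("dropout", 0.0, 0.3),
}

def objective(params):
    # Placeholder: in practice this would train the compression head with
    # `params` and return validation loss. A fake score keeps the sketch runnable.
    fake_val_loss = (params["learning_rate"] * 1e4 - 1.0) ** 2 + params["dropout"]
    return fake_val_loss

trials = Trials()
best = fmin(
    fn=objective,
    space=space,
    algo=tpe.suggest,   # Tree-structured Parzen Estimator search
    max_evals=25,
    trials=trials,
)
print("Best hyperparameters found:", best)
```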