Publications

Research papers and technical reports on machine learning, natural language processing, and AI systems.

View all on Google Scholar

Publications List

2023 Technical Report

STIM: Predicting Memory Uncorrectable Errors with Spatio-Temporal Transformer

Zhexiong Liu, Cris Benge, Siduo Jiang

We present STIM (Spatio-Temporal Inference Model), a transformer-based approach for predicting uncorrectable memory errors in data center servers. By analyzing spatial and temporal patterns in memory telemetry data, STIM enables proactive replacement of failing memory modules before they cause server crashes, significantly improving data center reliability.
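The telemetry schema is not spelled out in this summary; as a minimal sketch, assuming per-window error-count grids indexed by DIMM bank and row, the input to a STIM-style model might be assembled like this:

```python
# Minimal sketch of spatio-temporal input construction for a STIM-style
# model. The schema (bank x row error-count grids per time window) is an
# assumption for illustration, not the paper's exact telemetry format.
import numpy as np

def build_windows(error_events, n_banks=16, n_rows=32, n_windows=24):
    """Aggregate raw error events into a (n_windows, n_banks, n_rows)
    tensor of correctable-error counts, one spatial grid per time window."""
    grids = np.zeros((n_windows, n_banks, n_rows), dtype=np.float32)
    for window, bank, row in error_events:  # (window_idx, bank, row) triples
        if window < n_windows:
            grids[window, bank, row] += 1.0
    return grids

# Example: three correctable errors clustered in one bank/row region.
events = [(0, 3, 7), (1, 3, 7), (1, 3, 8)]
x = build_windows(events)
print(x.shape)  # (24, 16, 32) -- a sequence of spatial error maps
```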
2022 arXiv

BERTVision: A Parameter-Efficient Approach for Question Answering

Siduo Jiang, Cris Benge, William Casey King

We present BERTVision, a parameter-efficient approach that leverages BERT's hidden-state activations to achieve competitive performance on question answering tasks. Our model architectures are trained on BERT's intermediate representations, enabling efficient transfer learning while maintaining accuracy comparable to full-scale transformer models.
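As a hedged illustration of the hidden-state idea (the model name, pooling, and head shape below are assumptions, not the paper's exact architectures): freeze BERT, collect the [CLS] activation from every encoder layer, and train only a small head on top.

```python
# Sketch: freeze BERT, stack per-layer [CLS] activations, train a tiny head.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
bert.eval()  # frozen: no BERT parameters are updated

head = torch.nn.Linear(bert.config.hidden_size, 2)  # only trainable part

inputs = tok("What is parameter-efficient tuning?", return_tensors="pt")
with torch.no_grad():
    out = bert(**inputs)

# hidden_states: tuple of (embeddings + 12 layers), each (1, seq_len, 768).
# Stack the [CLS] vector from each layer -> (1, 13, 768), then pool.
cls_by_layer = torch.stack([h[:, 0, :] for h in out.hidden_states], dim=1)
pooled = cls_by_layer.mean(dim=1)  # average over layers (illustrative choice)
logits = head(pooled)              # gradients flow only through `head`
```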
2022 arXiv

Ticket-BERT: Labeling Incident Management Tickets with Language Models

Zhexiong Liu, Cris Benge, Siduo Jiang

We introduce Ticket-BERT, a specialized language model for automating the classification of incident management tickets in enterprise IT environments. Our approach leverages domain-specific fine-tuning to achieve high accuracy in categorizing and routing trouble tickets, significantly reducing manual triage effort and improving response times.
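A minimal fine-tuning sketch, assuming a BERT base model and hypothetical ticket categories (the paper's label set and hyperparameters may differ):

```python
# Illustrative fine-tuning step for ticket classification. Label names,
# example tickets, and the learning rate are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["network", "storage", "access", "other"]  # hypothetical categories
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

batch = tok(["VPN drops every 10 minutes", "Disk quota exceeded on share"],
            padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([0, 1])  # gold categories for the two tickets

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss  # cross-entropy over categories
loss.backward()
optimizer.step()
```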
2021 Technical Report

A High Performance Compression Approach for Transformer-Based NLP Tasks

Siduo Jiang, Cris Benge, Andrew Fogarty, William Casey King, Alberto Todeschini, Hossein Vahabi

We present a high-performance compression approach for transformer-based NLP models that achieves 209x reduction in parameters while maintaining accuracy comparable to full BERT models. Our method enables efficient deployment of large language models in resource-constrained environments without significant performance degradation.
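The summary above does not name the compression mechanism; as one generic route to large parameter reductions (not the paper's method), a dense layer can be factorized into two low-rank layers:

```python
# Generic illustration of parameter reduction via low-rank factorization.
# This is NOT the paper's technique, only a sketch of the counting argument.
import torch

d, r = 768, 16
full = torch.nn.Linear(d, d)            # 768*768 + 768 parameters
low_rank = torch.nn.Sequential(
    torch.nn.Linear(d, r, bias=False),  # 768*16
    torch.nn.Linear(r, d),              # 16*768 + 768
)

n_full = sum(p.numel() for p in full.parameters())
n_lr = sum(p.numel() for p in low_rank.parameters())
print(n_full, n_lr, round(n_full / n_lr, 1))  # ~23x fewer params in this layer
```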

Patents

Granted: Aug 25, 2025

Spatial-temporal memory uncorrectable error prediction system

US-2025-0272192A1

Systems and methods are directed to training and using a spatial-temporal transformer to predict memory errors. The system aggregates historical data, including error logs from data centers, by time windows and generates, from the aggregated historical data, a spatial representation of the errors and a set of micro features for each time window in an observation period. A memory feature vector is generated for each time window by flattening the spatial representation and appending the corresponding set of micro features to the end of the flattened spatial representation. The spatial-temporal transformer is trained by applying the memory feature vector for each time window to a transformer encoder. This training process is repeated for each observation period within a data collection period. At inference time, a similar process generates inference memory feature vectors for an inference observation period, which are applied to the trained transformer to predict errors.
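A minimal sketch of the claimed pipeline, with all dimensions, the stand-in telemetry, and the prediction head assumed for illustration:

```python
# Sketch: flatten each window's spatial error map, append that window's
# micro features, and feed the per-window vectors to a transformer encoder.
import torch

n_windows, n_banks, n_rows, n_micro = 24, 16, 32, 8
spatial = torch.randn(n_windows, n_banks, n_rows)  # stand-in spatial maps
micro = torch.randn(n_windows, n_micro)            # per-window micro features

# Memory feature vector per window: flattened spatial map + micro features.
features = torch.cat([spatial.flatten(start_dim=1), micro], dim=1)
d_model = features.shape[1]  # 16*32 + 8 = 520 (divisible by nhead=8)

layer = torch.nn.TransformerEncoderLayer(
    d_model=d_model, nhead=8, batch_first=True)
encoder = torch.nn.TransformerEncoder(layer, num_layers=2)

out = encoder(features.unsqueeze(0))  # (1, n_windows, d_model)
risk = torch.sigmoid(torch.nn.Linear(d_model, 1)(out[:, -1]))  # UE probability
```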

Granted: Aug 17, 2025

System and method for correlating virtual machine interruptions and node characteristics

US-2025-0217177A1

A method, computer program product, and computing system for collecting data concerning interruptions associated with a plurality of virtual machines, and for collecting hardware information concerning one or more nodes hosting the plurality of virtual machines at a time generally contemporaneous with the interruptions of the plurality of virtual machines. A correlation is generated between interruptions of at least a subset of the plurality of virtual machines and one or more hardware component attributes of the one or more nodes.
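An illustrative version of the correlation step, with a hypothetical schema (the patent does not fix column names or a join key):

```python
# Join VM interruption records with contemporaneous node hardware
# attributes, then correlate interruption occurrence against each attribute.
# All column names and values here are hypothetical.
import pandas as pd

interruptions = pd.DataFrame({
    "node_id": [1, 1, 2, 3],
    "vm_interrupted": [1, 1, 0, 1],
})
hardware = pd.DataFrame({
    "node_id": [1, 2, 3],
    "dimm_age_months": [36, 6, 24],
    "cpu_temp_c": [78, 55, 70],
})

merged = interruptions.merge(hardware, on="node_id")
print(merged[["vm_interrupted", "dimm_age_months", "cpu_temp_c"]].corr())
```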