A Lightweight Model for Balancing Efficiency and Precision in PEFT-Optimized Java Unit Test Generation
Date
2025-06
Authors
Publisher
Addis Ababa University
Abstract
Software testing accounts for nearly 50% of development costs yet is critical for ensuring software quality, creating an urgent need for more efficient testing solutions. This work addresses the challenge by developing a framework that combines Parameter-Efficient Fine-Tuning (PEFT) techniques with transformer models to automate Java unit test generation. The study systematically evaluates three PEFT approaches: Low-Rank Adaptation (LoRA), Quantized LoRA (QLoRA), and Adapters. The methodology involves specialized assertion pretraining on the Atlas dataset (1.2M Java method-assertion pairs), PEFT optimization, targeted fine-tuning on Methods2Test (780K test cases), and validation on the unseen Defects4J benchmark to assess cross-project generalization. Experimental results show that LoRA retains 92% of full fine-tuning effectiveness (38.12% correct test cases) while reducing GPU memory requirements by 17% and improving generation speed by 23%. QLoRA achieves even greater efficiency with a 36% memory reduction, making it particularly suitable for resource-constrained environments. On Defects4J, however, LoRA achieved 43.1% correct assertions against a full fine-tuning baseline of 46.0%, indicating a minor loss of cross-project generalization alongside the efficiency gains. These findings are currently limited to the Java programming language and the specific datasets used in our experiments. They nevertheless offer practical guidance for adopting AI-powered test generation, highlighting both the potential of PEFT techniques to reduce testing costs and the need for further research on maintaining test quality across diverse projects.
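The abstract summarizes the approach but does not spell out an implementation. As a rough illustration only, the sketch below shows how a LoRA configuration of the kind evaluated here could be attached to a code-aware transformer using the Hugging Face peft library; the backbone model (Salesforce/codet5-base), the rank, scaling, and target-module choices are assumptions made for the example, not details taken from the thesis. QLoRA would differ mainly in loading the base model with 4-bit quantization before attaching the same adapters.

```python
# Minimal sketch (not from the thesis): attaching LoRA adapters to a
# sequence-to-sequence transformer that maps a Java focal method to a test.
# Model name, hyperparameters, and library choices are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model_name = "Salesforce/codet5-base"  # assumed code-aware seq2seq backbone
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model_name)

# LoRA injects low-rank update matrices into the attention projections so that
# only a small fraction of parameters is trained, cutting GPU memory use.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                       # rank of the low-rank adapters (assumed)
    lora_alpha=32,              # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5-style attention projection layer names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction

# One training example pairs a focal Java method with its unit test, mirroring
# the Methods2Test / Atlas style of method-test supervision. Output quality
# here depends entirely on fine-tuning; the call only shows the interface.
focal_method = "public int add(int a, int b) { return a + b; }"
inputs = tokenizer(focal_method, return_tensors="pt", truncation=True)
generated = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```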