Balancing Performance and Efficiency in Large Language Model Fine-Tuning through Hierarchical Freezing
Abstract
This study investigates the efficiency of fine-tuning large language models and proposes an optimization method based on hierarchical parameter freezing. The method divides model parameters into three levels: lower, middle, and upper, and applies freezing, partial updating, and full updating to these levels, respectively, to balance stability and adaptability. The lower-level parameters preserve general syntactic and semantic knowledge, the middle-level parameters are updated flexibly according to task complexity, and the upper-level parameters focus on task-specific semantic modeling. In this way, the method substantially reduces computational and storage costs while maintaining performance. To verify its effectiveness, systematic comparative experiments were conducted and evaluated with multiple metrics, including ROUGE, BLEU, and exact match (EM). The results show that the proposed method achieves a better balance between performance and efficiency than the compared baselines. In addition, sensitivity experiments were carried out along three dimensions: hyperparameters, environmental settings, and data conditions, covering learning rate, sequence length, training data ratio, and text noise. The findings further demonstrate the robustness of the method under diverse conditions. By combining a hierarchical freezing mechanism with differentiated parameter-updating strategies, this study provides a new approach for the efficient use of large language models in resource-constrained environments and confirms the broad applicability of the method in real-world tasks.
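The three-level scheme described above can be pictured as per-layer trainability flags. The sketch below is an illustration rather than the authors' released code: it assumes a PyTorch model whose Transformer blocks sit in an nn.ModuleList, and the function name apply_hierarchical_freezing, the lower/upper fractions, and the middle_update_ratio knob (standing in for "partial updating according to task complexity") are all assumptions introduced here.

```python
# Minimal sketch of the hierarchical freezing idea (assumed names and splits,
# not the authors' implementation).
import torch
import torch.nn as nn


def apply_hierarchical_freezing(blocks: nn.ModuleList,
                                lower_frac: float = 1 / 3,
                                upper_frac: float = 1 / 3,
                                middle_update_ratio: float = 0.5) -> None:
    """Freeze lower blocks, partially update middle blocks, fully update upper blocks."""
    n = len(blocks)
    lower_end = int(n * lower_frac)        # blocks [0, lower_end) stay frozen
    upper_start = n - int(n * upper_frac)  # blocks [upper_start, n) are fully trained

    for i, block in enumerate(blocks):
        if i < lower_end:
            trainable = False              # lower level: frozen, keeps general knowledge
        elif i >= upper_start:
            trainable = True               # upper level: full updates for the target task
        else:
            # Middle level: only a fraction of blocks is unfrozen; the ratio is a
            # hypothetical stand-in for controlling updates by task complexity.
            middle_index = i - lower_end
            middle_count = upper_start - lower_end
            trainable = middle_index >= middle_count * (1 - middle_update_ratio)
        for p in block.parameters():
            p.requires_grad = trainable


# Toy usage: 12 Transformer encoder layers standing in for an LLM backbone.
blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True) for _ in range(12)
)
apply_hierarchical_freezing(blocks, middle_update_ratio=0.5)
optimizer = torch.optim.AdamW(
    (p for p in blocks.parameters() if p.requires_grad), lr=2e-5
)
print(sum(p.requires_grad for p in blocks.parameters()), "of",
      sum(1 for _ in blocks.parameters()), "parameter tensors are trainable")
```

Because only the unfrozen parameters are handed to the optimizer, gradient and optimizer-state memory is allocated just for the middle and upper levels, which is where the computational and storage savings claimed in the abstract would come from.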