
Hierarchical Semantic-Structural Encoding for Compliance Risk Detection with LLMs

Abstract

This paper addresses the structural complexity, semantic density, and task diversity of financial regulatory texts. It proposes a risk identification method based on large language models that integrates structure awareness and task adaptation. The method builds base semantic representations with a pre-trained language model and introduces a hierarchical semantic-structural encoding mechanism that explicitly captures the logical relationships among clause numbers, substructure hierarchies, and responsible entities in regulatory texts. A dynamic task adaptation module constructs task-aware representations and multi-task branches, allowing the model to distinguish between risk types such as compliance gaps and responsibility conflicts. To evaluate the method, a risk identification dataset is constructed from real financial regulatory documents, and sensitivity experiments are conducted along several dimensions, including structural integrity disturbance, sampling ratio variation, and encoding depth change. The results show that the method achieves high accuracy and robustness while demonstrating strong task adaptability and structural awareness, providing effective technical support for complex semantic understanding and risk factor modeling in financial regulatory texts.
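The abstract does not say how clause hierarchy is turned into model input, so the following is only an illustrative sketch of what capturing relationships among clause numbers could look like. It assumes clauses are numbered in dotted form (for example "3.2.1"), and it derives a depth and a parent reference for each clause. The names `ClauseNode` and `encode_clause_hierarchy` are hypothetical and do not come from the paper. In the proposed method, structural features of this kind would presumably be combined with representations from the pre-trained language model, which this sketch does not show.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClauseNode:
    """Hypothetical structural record for one clause (not from the paper)."""
    number: str            # clause number as written, e.g. "3.2.1"
    depth: int             # hierarchy level: 1 for "3", 3 for "3.2.1"
    parent: Optional[str]  # number of the enclosing clause, None at top level


def encode_clause_hierarchy(clause_numbers: List[str]) -> List[ClauseNode]:
    """Map dotted clause numbers to depth and parent features.

    Assumption: clause numbers use dot-separated levels.
    """
    nodes = []
    for number in clause_numbers:
        parts = number.split(".")
        # The parent is the same number with its last level removed.
        parent = ".".join(parts[:-1]) if len(parts) > 1 else None
        nodes.append(ClauseNode(number=number, depth=len(parts), parent=parent))
    return nodes


nodes = encode_clause_hierarchy(["3", "3.2", "3.2.1"])
for node in nodes:
    print(node.number, node.depth, node.parent)
```

The depth and parent fields give a downstream encoder an explicit view of which clauses nest inside which, independent of the clause wording itself.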
