Dynamically Allocated Interval-Based Generative Linguistic Steganography with Roulette Wheel
Abstract
Existing linguistic steganography schemes often overlook the conditional probability (CP) of tokens in the candidate pool, assigning the same coding to all tokens, which gives every token the same likelihood of selection. As a result, low-CP tokens tend to be selected, which degrades the quality of the stegos and makes them easier to detect. This paper proposes an interval-allocation-based scheme called DAIRstega. DAIRstega first reads part of the secret and uses it to build the roulette area. It then applies the idea of the roulette wheel, taking the CPs of tokens as the main basis for allocating the roulette area (i.e., the interval length). Tokens with larger CPs are therefore allocated more area, so the secret is more likely to select a token with a higher CP. For the allocation, we design several allocation functions and three constraints to optimize the process. In addition, DAIRstega supports prompt-based controllable generation of stegos. Extensive experiments show that the proposed embedding method and DAIRstega outperform existing methods and baselines, exhibiting strong perceptual, statistical, and semantic concealment as well as anti-steganalysis ability. DAIRstega can also generate high-quality longer stegos, addressing a deficiency in this task. It is further confirmed to have potential as a secure watermarking technique, offering insights for the development of related fields.
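The roulette-style selection described in the abstract can be illustrated with a minimal Python sketch. This is not DAIRstega's implementation: the function names (`allocate_intervals`, `select_token`, `shared_prefix`), the plain proportional allocation, and the toy candidate pool are all assumptions made for illustration, and the paper's allocation functions and three constraints are not modeled. The shared-prefix step is likewise an assumed way of showing which bits a receiver could recover from the chosen interval.

```python
def allocate_intervals(probs, area):
    """Split the integer range [0, area) into contiguous intervals whose
    lengths are roughly proportional to probs (assumed allocation rule)."""
    total = sum(probs)
    bounds, start, cum = [], 0, 0.0
    for i, p in enumerate(probs):
        cum += p
        # Pin the final boundary to `area` to avoid floating-point drift.
        end = area if i == len(probs) - 1 else round(area * cum / total)
        bounds.append((start, end))
        start = end
    return bounds


def select_token(tokens, probs, secret_bits, m):
    """Read m secret bits, treat them as a point on a wheel of size 2**m,
    and return the token whose interval contains that point."""
    value = int(secret_bits[:m], 2)
    for token, (lo, hi) in zip(tokens, allocate_intervals(probs, 2 ** m)):
        if lo <= value < hi:
            return token, (lo, hi)
    raise ValueError("no interval contains the secret value")


def shared_prefix(lo, hi, m):
    """Bits common to every m-bit value in [lo, hi): the part of the
    secret that the chosen interval alone determines (an assumption)."""
    a, b = format(lo, f"0{m}b"), format(hi - 1, f"0{m}b")
    prefix = ""
    for x, y in zip(a, b):
        if x != y:
            break
        prefix += x
    return prefix


# Hypothetical candidate pool and conditional probabilities.
tokens = ["the", "a", "cat", "zebra"]
probs = [0.5, 0.25, 0.125, 0.125]

token, (lo, hi) = select_token(tokens, probs, "101", m=3)
print(token, (lo, hi), shared_prefix(lo, hi, 3))
```

In this toy pool, the highest-CP token owns half of the wheel, so half of all 3-bit secret values select it. A token with a smaller interval is chosen less often, but its interval pins down more bits of the secret.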