Advanced Textile Technology (现代纺织技术) ›› 2025, Vol. 33 ›› Issue (06): 82-90. DOI: 10.12477/xdfzjs.20250610


Analysis of textile quality inspection data based on the T-Apriori algorithm

  

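  1. College of Textiles, Donghua University, Shanghai 201620, China; 2. Shanghai Customs College, Shanghai 201204, China
  • Received: 2024-10-12 Published: 2025-06-10 Online: 2025-06-17
  • About the author: 吕沿沿 (Lü Yanyan; b. 2000), female, master's student, whose research focuses on textile quality evaluation
  • Funding:
    National Key Research and Development Program of China (2022YFF0607203)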

Analysis of textile quality inspection data based on T-Apriori algorithm

  1. College of Textiles, Donghua University, Shanghai 201620, China; 2. Shanghai Customs College, Shanghai 201204, China
  • Received: 2024-10-12 Published: 2025-06-10 Online: 2025-06-17

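Summary: To address the long computation time of association rule mining on unqualified textile quality inspection data, the traditional Apriori algorithm is optimized and a triplet-based data mining method, the T-Apriori algorithm, is proposed. The core idea is to compress the Boolean matrix of the data and store it in triplet form, which reduces storage space and improves computational efficiency. For determining the support, a more objective method is also proposed: the support threshold is adjusted by analyzing how the number of candidate 1-itemsets changes across the frequency intervals of candidate 1-itemsets, so that it better fits the characteristics of the data. The results show that, compared with the traditional Apriori algorithm and its optimized version, the C-Apriori algorithm, the T-Apriori algorithm delivers a significant performance improvement, with a runtime only 40% of that of the traditional Apriori algorithm. The reduction in runtime is especially pronounced when the data volume is large and the support is low, showing that the algorithm performs better under those conditions. With the T-Apriori algorithm, the efficiency of analyzing textile quality inspection data is greatly improved, providing a more efficient data analysis tool for quality supervision and decision support, with significant practical value.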

Keywords: association rules, Apriori algorithm, triplets, quality inspection, textiles

Abstract: Research on association rules for textile quality inspection data remains limited and is mostly at a preliminary, exploratory stage. Few studies propose improvements to the algorithms used in this field, and systematic optimization strategies have yet to emerge. Moreover, mining association rules with the traditional Apriori algorithm is time-consuming on large datasets, and the way rule metric thresholds are determined lacks transparency. This paper therefore aims to shorten the computation time of association rule mining on unqualified textile quality inspection data and to make the determination of the support threshold less subjective and more transparent.
The T-Apriori algorithm, an optimization of the traditional Apriori algorithm, was adopted. Its core idea is to compress the Boolean matrix of the data and store it in triplet form. Specifically, each transaction in the Boolean matrix is converted into a set of triplets, where each non-zero element is represented as a triplet (i, j, v), with i and j being the row and column indices and v the value of the element. The algorithm scans the converted triplet set rather than the original matrix, and support, confidence, and lift are all computed from data retrieved from these triplet sets. Unqualified textile quality inspection data are very sparse, with most elements being zero; because triplets store only the non-zero elements, storage space is reduced and computational efficiency is improved. To determine the support threshold, the trend in the number of candidate 1-itemsets across frequency intervals is analyzed and the threshold is adjusted accordingly, so that it adapts to the characteristics of the data and identifies relatively high-frequency itemsets.
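The triplet storage described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`to_triplets`, `rows_by_item`, `support`, `rule_metrics`), the per-column index, and the toy matrix are all assumptions made for illustration. It only shows how support, confidence, and lift can be computed from (i, j, v) triplets without rescanning the Boolean matrix.

```python
from collections import defaultdict


def to_triplets(matrix):
    """Return one (i, j, v) triplet for every non-zero entry of a Boolean matrix."""
    return [(i, j, v)
            for i, row in enumerate(matrix)
            for j, v in enumerate(row) if v]


def rows_by_item(triplets):
    """Group triplets by column: item j -> set of row indices i that contain it."""
    index = defaultdict(set)
    for i, j, _ in triplets:
        index[j].add(i)
    return index


def support(itemset, index, n_rows):
    """Fraction of transactions containing every item of a non-empty itemset."""
    rows = set.intersection(*(index.get(j, set()) for j in itemset))
    return len(rows) / n_rows


def rule_metrics(antecedent, consequent, index, n_rows):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    Assumes both sides occur at least once (no zero-division guard).
    """
    s_a = support(antecedent, index, n_rows)
    s_c = support(consequent, index, n_rows)
    s_ac = support(antecedent | consequent, index, n_rows)
    confidence = s_ac / s_a
    return s_ac, confidence, confidence / s_c


# Toy data (invented): 5 transactions x 4 items, mostly zeros.
matrix = [
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
triplets = to_triplets(matrix)   # 8 triplets instead of 20 matrix cells
index = rows_by_item(triplets)
s, conf, lift = rule_metrics(frozenset({0}), frozenset({2}), index, len(matrix))
print(len(triplets), s, round(conf, 3), round(lift, 3))
```

In this toy example, only the 8 non-zero cells are stored, and the support of an itemset reduces to intersecting the row sets of its items, which is where a sparse dataset would be expected to save time.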
The experimental results show that the trend in the number of candidate 1-itemsets can be used to identify relatively high-frequency itemsets, and the support threshold for the itemset {brand and total unqualified inspection items} is set to 0.002. The T-Apriori algorithm demonstrates significant performance improvements over the traditional Apriori algorithm and its optimized version, C-Apriori: its runtime is only 40% of that of the traditional Apriori algorithm. As the volume of data increases, the reduction in runtime becomes more pronounced, as shown in Fig. 5. The lower the support threshold, the larger the gap in runtime between the T-Apriori algorithm and the traditional Apriori algorithm, as illustrated in Fig. 6. In summary, the T-Apriori algorithm exhibits superior processing performance with large data volumes and low support thresholds. By mining textile quality inspection data from 2018 to 2023, 72 strong association rules are obtained, and based on these rules, two regulatory recommendations are proposed to the supervision department. The adoption of the T-Apriori algorithm greatly improves the efficiency of analyzing textile quality inspection data, providing a more efficient data analysis tool for quality supervision and decision support. This has important practical application value.
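The abstract does not spell out how the trend in candidate 1-itemset counts is turned into a threshold, so the following Python sketch is only one possible reading. The helper names (`interval_counts`, `pick_min_count`), the interval width, the "flattening" rule with tolerance `flat_tol`, and the toy frequencies are all hypothetical. It counts how many candidate 1-itemsets fall in each frequency interval and picks the lower bound of the first interval where that count stops changing sharply.

```python
from collections import Counter


def interval_counts(item_freqs, width):
    """Count candidate 1-itemsets per frequency interval [k*width, (k+1)*width).

    Returns a list of (interval lower bound, number of 1-itemsets) pairs.
    Assumes item_freqs is non-empty.
    """
    bins = Counter(freq // width for freq in item_freqs.values())
    return [(k * width, bins.get(k, 0)) for k in range(max(bins) + 1)]


def pick_min_count(counts, flat_tol=1):
    """Hypothetical rule: lower bound of the first interval whose count differs
    from the previous interval's count by at most flat_tol."""
    for (_, prev), (low, cur) in zip(counts, counts[1:]):
        if abs(prev - cur) <= flat_tol:
            return low
    return counts[-1][0]


# Toy frequencies (invented): most items are rare, a few are frequent.
item_freqs = {"a": 1, "b": 1, "c": 2, "d": 1, "e": 2,
              "f": 3, "g": 6, "h": 7, "i": 12, "j": 25}
n_transactions = 1000  # assumed dataset size for the toy example

counts = interval_counts(item_freqs, width=5)
min_count = pick_min_count(counts)
min_support = min_count / n_transactions
print(counts, min_count, min_support)
```

On the toy data the count per interval falls from 6 to 2 to 1 and then levels off, so the sketch selects a minimum count of 10, that is, a support threshold of 0.01 for 1000 transactions. Other criteria, such as the largest relative drop, would be equally defensible readings of the described method.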

Key words: association rules, Apriori algorithm, triplets, quality inspection, textiles

CLC number: