Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (6): 2084-2097. DOI: 10.13196/j.cims.2024.0303
ZUO Daiyue, JIANG Wenbo, ZHENG Hangbin, BAO Jinsong+
左戴悦,蒋文波,郑杭彬,鲍劲松+
Abstract: To address the limited interpretability of deep learning approaches in photovoltaic (PV) panel defect recognition, an interpretable PV defect visual question answering framework driven by the fusion of data and knowledge was proposed. First, a tandem deep learning model performed the PV panel defect recognition task. Then, an image-text multimodal model was fine-tuned to learn expert knowledge and to evaluate the explanation images produced by the Grad-CAM method. Finally, a specialized prompt template was designed to integrate information from multiple stages into natural language dialogue. Based on a multimodal large language model, the defect recognition model was interpreted and the applicability of its results was extended. The framework enhanced the interpretability of the defect detection model and achieved accurate and reliable visual question answering, improving the efficiency and usability of PV panel defect recognition.
Key words: photovoltaic panel, defect recognition, multi-modal, interpretability, visual reasoning
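Abstract (translated from the Chinese original): To address the interpretability of deep learning methods in photovoltaic panel defect recognition, an interpretable photovoltaic defect visual question answering framework driven by the fusion of data and knowledge is proposed. First, cascaded deep learning models perform the photovoltaic panel defect recognition task. Then, an image-text multimodal model is fine-tuned to learn expert knowledge for evaluating the detection model's attention heat maps. Finally, a dedicated prompt template is designed to integrate information from multiple levels into the form of natural language dialogue. Interpreting the photovoltaic panel defect recognition model with a multimodal large language model extends the applicability of the detection results to industrial scenarios, enhances the interpretability of the defect detection model, achieves accurate and reliable visual question answering, and improves the efficiency and usability of the photovoltaic panel defect recognition task.
Key words (translated from the Chinese original): photovoltaic panel, defect recognition, multimodal, interpretability, visual reasoning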
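The abstract's final stage, a prompt template that merges the outputs of earlier stages into a natural language dialogue, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the template wording, so `DefectFinding`, `PROMPT_TEMPLATE`, `build_prompt`, the field names, and the sample values ("crack", 0.93) are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class DefectFinding:
    """Hypothetical container for the outputs of the earlier stages."""
    defect_type: str         # label from the defect recognition model
    confidence: float        # recognition score in [0, 1]
    heatmap_assessment: str  # image-text model's verdict on the Grad-CAM map


# Hypothetical template; the paper's actual wording is not given in the abstract.
PROMPT_TEMPLATE = (
    "You are assisting with photovoltaic panel inspection.\n"
    "Recognition result: {defect_type} (confidence {confidence:.2f}).\n"
    "Assessment of the Grad-CAM explanation: {heatmap_assessment}\n"
    "Question: {question}\n"
    "Answer using only the information above."
)


def build_prompt(finding: DefectFinding, question: str) -> str:
    """Fill the template with the stage outputs and the user's question."""
    if not 0.0 <= finding.confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return PROMPT_TEMPLATE.format(
        defect_type=finding.defect_type,
        confidence=finding.confidence,
        heatmap_assessment=finding.heatmap_assessment,
        question=question.strip(),
    )


if __name__ == "__main__":
    # Illustrative values only; they are not results from the paper.
    finding = DefectFinding(
        "crack", 0.93, "The highlighted region overlaps the visible crack."
    )
    print(build_prompt(finding, "Is this panel safe to keep in service?"))
```

The resulting string would be passed, together with the panel image, to whichever multimodal large language model is used. Keeping the recognition result and the explanation assessment as separate fields lets the model's answer refer to each source of evidence explicitly.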
CLC Number: TP391.41; TM615
ZUO Daiyue, JIANG Wenbo, ZHENG Hangbin, BAO Jinsong. Interpretable visual question answering method for defect recognition[J]. Computer Integrated Manufacturing System, 2025, 31(6): 2084-2097.
左戴悦, 蒋文波, 郑杭彬, 鲍劲松. 面向缺陷识别的可解释视觉问答方法[J]. 计算机集成制造系统, 2025, 31(6): 2084-2097.
URL: http://www.cims-journal.cn/EN/10.13196/j.cims.2024.0303
http://www.cims-journal.cn/EN/Y2025/V31/I6/2084