Data Optimization Strategies for Machine Learning of Energetic Materials

doi:10.11943/CJEM2025098

Data Optimization Strategies for Machine Learning of Energetic Materials

Authors: Liu Chenhao, Zhang Lei, Pang Siping

Affiliation: Beijing Institute of Technology

Abstract:

Machine learning, an emerging data-driven technology, offers a new route to the intelligent research and development of energetic materials. However, data scarcity combined with data heterogeneity has become the core bottleneck limiting modeling accuracy and wider application. This review examines how energetic-materials data are currently acquired and the problems those routes face, then discusses mainstream data optimization strategies from two perspectives: quantity expansion and quality improvement. For data quantity, it introduces recent progress in applying Simplified Molecular Input Line Entry System (SMILES) enumeration, generative adversarial networks (GANs), and transfer learning to improve model generalization. For data quality, it discusses the roles of outlier identification, standardized preprocessing, and feature engineering in strengthening model robustness and interpretability. The work surveyed indicates that sound data optimization not only alleviates data scarcity but also significantly improves prediction stability and structural extrapolation under small-sample and structurally complex conditions. Finally, future directions are proposed: building high-throughput experimental platforms, unifying data standards, and developing intelligent closed-loop systems. These are expected to provide a feasible roadmap and methodological reference for building the data ecosystem and advancing the intelligent design of energetic materials.
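As a rough illustration of the outlier-identification step mentioned under data quality, the sketch below flags suspicious property values with a modified z-score based on the median and the median absolute deviation (MAD). The abstract does not say which detection method the review favors, so the choice of this rule, the function name `mad_outliers`, the 3.5 threshold, and the density values are all assumptions made for illustration; the numbers are invented and are not taken from any dataset.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Hypothetical helper: the median/MAD rule is one common choice for
    small samples, not necessarily the method used in the review.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # No spread around the median, so this rule cannot rank the points.
        return []
    # 0.6745 rescales the MAD so scores are comparable to ordinary
    # z-scores when the data are roughly normal.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Invented crystal densities in g/cm^3, for illustration only.
# The value 18.2 mimics a misplaced decimal point in a collected record.
densities = [1.65, 1.72, 1.81, 1.77, 1.90, 18.2, 1.69, 1.84]

flagged = mad_outliers(densities)
cleaned = [v for i, v in enumerate(densities) if i not in flagged]

print("flagged indices:", flagged)
print("cleaned values:", cleaned)
```

A median-based rule is used here because, in a small dataset, a single extreme record can inflate the mean and standard deviation enough to hide itself. In practice a flagged entry would normally be checked against its source before being dropped.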

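Cite this article

Liu Chenhao (刘辰昊), Zhang Lei (张蕾), Pang Siping (庞思平). Data Optimization Strategies for Machine Learning of Energetic Materials[J]. Chinese Journal of Energetic Materials (含能材料), DOI:10.11943/CJEM2025098.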

History

Received: 2025-05-16
Last revised: 2025-06-23
Accepted: 2025-07-07
