Approximate algorithm of interactive dynamic influence diagrams based on KL distance  Cited by: 2


Authors: 田乐[1], 罗键[1], 曹浪财[1], 陈志平[1]

Affiliation: [1] School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China

Source: Systems Engineering and Electronics (《系统工程与电子技术》), 2013, No. 1, pp. 207-211 (5 pages)

Funding: Supported by the National Natural Science Foundation of China (60975052)

Abstract: The model space of interactive dynamic influence diagrams (I-DIDs) is very large, and the number of candidate models grows exponentially with the number of time steps. To address the resulting computational cost, an approximate algorithm is proposed that combines the principle of approximate behavioral equivalence with the discriminative model update (DMU) algorithm. First, definitions of behavioral equivalence and approximate behavioral equivalence of models, based on the Kullback-Leibler (KL) distance, are given. The candidate models are then clustered according to the KL distance and their actions, and the policy trees are merged top-down into policy graphs. Finally, the I-DIDs are solved with the DMU algorithm. Simulation results show that, compared with the traditional DMU algorithm, the proposed approximate algorithm significantly reduces the number of candidate models and improves the efficiency of solving I-DIDs. The work offers a reference for theoretical and applied research on I-DIDs.
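The clustering step described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not say which distributions the KL distance is taken over, whether a symmetric form is used, or how clusters are formed. The sketch assumes each candidate model is summarized by a hypothetical pair (prescribed action, discrete probability distribution), uses a symmetrized KL distance, and groups models greedily against a hypothetical threshold. The names `kl_distance`, `symmetric_kl`, `cluster_models`, and `threshold` are all illustrative.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical summary of a candidate model: (prescribed action, distribution).
Model = Tuple[str, Sequence[float]]


def kl_distance(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Terms with p_i == 0 contribute nothing; q_i is floored at eps to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p: Sequence[float], q: Sequence[float]) -> float:
    """Symmetrized KL distance (an assumption; the source does not specify the form)."""
    return kl_distance(p, q) + kl_distance(q, p)


def cluster_models(models: List[Model], threshold: float) -> List[List[int]]:
    """Greedy clustering: a model joins the first cluster whose representative
    (its first member) has the same action and lies within `threshold`."""
    clusters: List[List[int]] = []
    for i, (action, dist) in enumerate(models):
        for cluster in clusters:
            rep_action, rep_dist = models[cluster[0]]
            if action == rep_action and symmetric_kl(dist, rep_dist) <= threshold:
                cluster.append(i)
                break
        else:
            # No compatible cluster found: start a new one.
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    candidates: List[Model] = [
        ("listen", [0.50, 0.50]),
        ("listen", [0.52, 0.48]),
        ("open_left", [0.50, 0.50]),
        ("listen", [0.90, 0.10]),
    ]
    print(cluster_models(candidates, threshold=0.05))
```

Under these assumptions, models in one cluster would be treated as approximately behaviorally equivalent and represented by a single model, which is how the number of candidate models shrinks before the DMU step. The merging of policy trees into policy graphs is not covered by this sketch.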

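Keywords: multi-agent decision making; interactive dynamic influence diagrams; behavioral equivalence; approximate behavioral equivalence; Kullback-Leibler (KL) distance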

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
