Study on General Metadata Extraction from Founder Typesetting Files and Generation of Full-Text HTML

Cited by: 5

Source: Chinese Journal of Scientific and Technical Periodicals (《中国科技期刊研究》), 2016, No. 2, pp. 202-206 (5 pages)

Funding: Liaoning Province Social Science Planning Fund project (L12DXW011)

Abstract: [Purposes] To automatically extract full-text metadata from science and technology journals and generate HTML files. [Methods] Taking Founder typesetting files as the object, and building on the extraction of metadata such as titles and abstracts, the body text of the article is also converted into metadata. The concept of general metadata (GM), which covers figures, tables, and formulas, is proposed. Extraction rules for figure and table metadata are established, and Founder typeset mathematical formulas are converted into LaTeX expressions. A program written in VB then extracts the GM automatically and recombines it into an HTML file. [Findings] Based on the characteristics of the Founder BD typesetting language, the extraction rules effectively extract the full text as metadata, and the HTML file can be generated directly. [Conclusions] Practical application shows that generating HTML files from general metadata is effective and feasible.

Keywords: general metadata; Founder BD typesetting language; VB programming software; automatic full-text extraction; HTML file

Classification number: G230.7 [Culture and Science]
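
The final step described in the abstract, recombining extracted general metadata into an HTML file, can be illustrated with a minimal sketch. This is not the authors' VB program, and it does not parse the Founder BD typesetting language. It assumes the metadata has already been extracted into a dictionary whose field names (`title`, `abstract`, `body`, `kind`, `latex`, and so on) are hypothetical, and that formulas are already LaTeX strings to be displayed by a MathJax-like renderer using `\( \)` delimiters.

```python
from html import escape


def render_item(item):
    """Render one body-level metadata item as an HTML fragment.

    The item layout (a dict with a "kind" key) is a hypothetical
    structure chosen for this sketch, not one defined by the paper.
    """
    kind = item["kind"]
    if kind == "paragraph":
        return f"<p>{escape(item['text'])}</p>"
    if kind == "figure":
        return (
            f'<figure><img src="{escape(item["src"])}" alt="">'
            f"<figcaption>{escape(item['caption'])}</figcaption></figure>"
        )
    if kind == "table":
        rows = "".join(
            "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
            for row in item["rows"]
        )
        return f"<table><caption>{escape(item['caption'])}</caption>{rows}</table>"
    if kind == "formula":
        # The LaTeX source is kept as text; the \( \) delimiters assume
        # a MathJax-like renderer is loaded by the page (an assumption).
        return f'<p class="formula">\\({escape(item["latex"])}\\)</p>'
    raise ValueError(f"unknown item kind: {kind}")


def build_html(meta):
    """Assemble a complete HTML document from a general-metadata dict."""
    body = "\n".join(render_item(item) for item in meta["body"])
    return (
        '<!DOCTYPE html>\n<html>\n<head>\n<meta charset="utf-8">\n'
        f"<title>{escape(meta['title'])}</title>\n</head>\n<body>\n"
        f"<h1>{escape(meta['title'])}</h1>\n"
        f'<p class="abstract">{escape(meta["abstract"])}</p>\n'
        f"{body}\n</body>\n</html>\n"
    )


# Hypothetical sample record, for illustration only.
sample = {
    "title": "Sample article",
    "abstract": "A short abstract.",
    "body": [
        {"kind": "paragraph", "text": "Body text with a < sign."},
        {"kind": "formula", "latex": "E = mc^2"},
        {"kind": "table", "caption": "Table 1", "rows": [["a", "b"], ["1", "2"]]},
    ],
}
page = build_html(sample)
```

Escaping every text field with `html.escape` keeps characters such as `<` and `&` in titles, captions, and formulas from breaking the generated markup, which matters when the input comes from typesetting files rather than hand-written HTML.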
