Software Platform for Expanded Sentence Category Analysis
on the HNC Theory
WEI Xiang-feng (Signal and Information Processing)
Directed by ZHANG Quan
Chinese sentences are constructed according to the meaning of words rather than their form. This creates many obstacles for processing Chinese: words do not change form, parts of speech (POS) are highly flexible, and case, plural, and tense markers are absent. According to the HNC (Hierarchical Network of Concepts) theory, the more than 6,000 natural languages all map to a single conceptual language space. This space has four layers: concept primitives, sentence categories, context elements, and contexts. Natural Language Understanding (NLU) is then a mapping process between real language and the conceptual language space. For the analysis process, a method that works on the conceptual structure of a sentence has been put forward; it is named Sentence Category Analysis (SCA).
This dissertation aims to construct an extensible software platform for sentence group (SG) analysis based on the HNC theory, and to study the preliminary processing of SCA and SG.
A rule-based method is adopted. The formalization of the rules for each processing phase of the extensible SCA (for SG understanding) is the central thread of the dissertation. The main processing strategy is that unambiguous points are processed first, that the data decide whether the whole or a part is processed first, and that the conceptual sentence structure is acquired step by step.
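The "most certain points first" strategy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Rule` fields, the numeric `certainty` score, and the regular-expression patterns are hypothetical and are not the formal rule language designed in the dissertation. The sketch only shows rules being applied from most to least certain, with a later rule never overriding a region that a more certain rule has already decided.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: a pattern, the label it assigns, and a certainty."""
    name: str
    pattern: str       # regular expression over the raw text
    label: str         # label given to a matching span
    certainty: float   # higher means less ambiguous


def apply_rules(text, rules):
    """Apply rules from most to least certain without overwriting earlier decisions."""
    claimed = [False] * len(text)  # True where a span has already been decided
    spans = []
    for rule in sorted(rules, key=lambda r: r.certainty, reverse=True):
        for match in re.finditer(rule.pattern, text):
            start, end = match.span()
            if start == end:
                continue  # ignore empty matches
            if any(claimed[start:end]):
                continue  # a more certain rule already covers part of this span
            for i in range(start, end):
                claimed[i] = True
            spans.append((start, end, rule.label))
    return sorted(spans)


# Illustrative rules only; the patterns are assumptions, not rules from the platform.
rules = [
    Rule("bare-number", r"\d+", "QUANTITY", certainty=0.5),
    Rule("clock-time", r"\d+ (?:am|pm)", "TIME", certainty=0.9),
]
text = "meet at 3 pm in room 12"
spans = apply_rules(text, rules)
print(spans)  # [(8, 12, 'TIME'), (21, 23, 'QUANTITY')]
```

In this toy input the digit "3" also matches the less certain quantity rule, but the time rule has already claimed it, so only "12" is labeled as a quantity.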
To build an extensible SCA platform, the dissertation covers the following aspects: the design and implementation of an extensible platform based on formal rules; semantic chunk perception and sentence category hypothesis; sentence category testing and processing inside semantic chunks; SG structure analysis; and the preliminary acquisition of basic information about context elements.
Building on previous research on the HNC theory, this dissertation studies problems related to SCA and SG understanding. Its main contributions are as follows:
(1) A formal language for describing rules was designed and used to formalize the rules. Researchers can run the rules on the platform without knowing the details of the program.
(2) An extensible software platform based on the formal rules was implemented, on which the formal rules can be executed. The processing ability of the platform can be improved as research in this field progresses, which in turn improves the ability of the SCA.
(3) The platform has a friendly visual interface. Results are displayed through a graphical user interface, and the text-mode output can be customized. This facilitates human-machine interaction and program debugging.
(4) Rules for recognizing phrases of time, space, and quantity were studied and tested on the platform. The results indicated that both the correct rate (precision) and the recall rate were high. The ability of the software platform will improve as the rules are refined.
(5) The dissertation also studied rules for global eigen-chunk perception, especially for the case where two 'v' concepts appear consecutively. The rules were formalized and tested on the platform. The results indicated that the rules are suitable for SCA processing. This study improved and enriched the capabilities of the software platform.
(6) Building on previous research, the dissertation studied how main semantic chunks are shared among the clauses of an SG, and discussed the strategy and rules for recovering omitted main semantic chunks, laying a theoretical foundation for SG structure analysis.
(7) The dissertation summarized methods for the preliminary acquisition of basic information about context elements. Methods for acquiring this information within a single language segment were explored, which will help the related processing of multiple language segments.
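The correct rate and recall rate mentioned in item (4) are commonly computed by comparing recognized phrase spans against a hand-annotated reference. The sketch below assumes exact-match scoring over `(start, end, label)` spans, which may differ from the scoring actually used in the dissertation. The spans are made-up toy data for illustration, not results from the platform.

```python
def precision_recall(predicted, gold):
    """Exact-match precision and recall over (start, end, label) spans."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # spans that match the reference exactly
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy data (assumed for illustration): spans produced by the rules ...
predicted = [(0, 4, "TIME"), (10, 12, "QUANTITY"), (15, 18, "SPACE")]
# ... and spans marked by a human annotator.
gold = [(0, 4, "TIME"), (10, 12, "QUANTITY"), (20, 25, "SPACE"), (30, 33, "TIME")]

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

Exact-match scoring is strict: a span with the right label but a boundary off by one character counts as both a false positive and a miss.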
In summary, this dissertation implemented an extensible software platform that serves SCA, and the SCA rules were tested on it. The work provides a platform that is useful for breaking through the 20 difficulties of SCA and for launching the mapping from SG to context elements, and it is a foundation for sentence-SG analysis.
Keywords: Hierarchical Network of Concepts (HNC) theory; rule; sentence category analysis (SCA); sentence group (SG); context