Chinese Person and Organization Entity Names Recognition
Based on Conceptual Relationship Knowledge
Jia Ning (Signal and Information Processing)
Directed by Zhang Quan
Entity name recognition is a basic problem in natural language processing. It is
widely used in information extraction, information retrieval, question answering,
and machine translation. Because entity names are numerous and vary widely in
structure, their automatic recognition is a valuable research area.
This dissertation focuses on person and organization name recognition and
presents a method based on conceptual relationship knowledge. Person and
organization names are labels in language space; their counterpart in the
conceptual language space is the 'pp' tag. Sentence category knowledge and domain
sentence category knowledge contain the relationships between semantic chunks and
the anticipation of each semantic chunk's concept. These two kinds of knowledge
are used to extract the semantic chunks that contain 'pp'. After the structure of
such a chunk is analyzed, the position of 'pp' within it is located. A
recognition algorithm then extracts the person and organization names from the
semantic chunk.
The main contributions of this dissertation are as follows:
Presented a method for person and organization name recognition based on
sentence category analysis and domain sentence categories. The method has three
steps. First, semantic chunks that contain 'pp' are extracted using semantic
chunk relationship rules. Second, the structure of the chunks extracted in step 1
is analyzed, and the parts that contain 'pp' are extracted. Third, person and
organization names are recognized within the parts from step 2. Experiments show
that the method achieves a precision of more than 99% for extracting semantic
chunks that contain the 'pp' concept.
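The three steps above can be sketched as a small pipeline. This is only an illustrative sketch, not the dissertation's implementation: the rule table, the boundary cue, the toy lexicons, and every name in the code (`PP_RULES`, `CAT_1`, `ROLE_A`, and so on) are hypothetical placeholders rather than HNC symbols or rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Chunk:
    role: str          # placeholder chunk label, not a real HNC symbol
    tokens: List[str]  # words of the semantic chunk, already segmented


# Assumption: a toy rule table that maps a sentence category to the chunk
# roles anticipated to carry the 'pp' concept.
PP_RULES: Dict[str, Set[str]] = {"CAT_1": {"ROLE_A"}}

# Assumption: toy cue words that end the 'pp' part inside a chunk.
PART_BOUNDARIES: Set[str] = {"of"}

# Assumption: toy lexicons standing in for real name-recognition knowledge.
ORG_SUFFIXES: Set[str] = {"University", "Company"}
SURNAMES: Set[str] = {"Zhang", "Wang"}


def extract_pp_chunks(category: str, chunks: List[Chunk]) -> List[Chunk]:
    """Step 1: keep the chunks whose role is anticipated to hold 'pp'."""
    roles = PP_RULES.get(category, set())
    return [c for c in chunks if c.role in roles]


def locate_pp_part(chunk: Chunk) -> List[str]:
    """Step 2: keep the tokens before the first boundary cue."""
    part: List[str] = []
    for token in chunk.tokens:
        if token in PART_BOUNDARIES:
            break
        part.append(token)
    return part


def recognize_names(part: List[str]) -> List[Tuple[str, str]]:
    """Step 3: label the part as ORG or PER using the toy lexicons."""
    if not part:
        return []
    if part[-1] in ORG_SUFFIXES:
        return [(" ".join(part), "ORG")]
    if part[0] in SURNAMES:
        return [(" ".join(part), "PER")]
    return []


def recognize(category: str, chunks: List[Chunk]) -> List[Tuple[str, str]]:
    """Run the three steps in order over one analyzed sentence."""
    results: List[Tuple[str, str]] = []
    for chunk in extract_pp_chunks(category, chunks):
        results.extend(recognize_names(locate_pp_part(chunk)))
    return results
```

Keeping the steps as separate functions mirrors the description: the chunk-level filter in step 1 can be evaluated on its own, which is the stage the reported precision figure refers to.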
relationship knowledge in sentence category space and the HNC knowledge database.
Designed semantic chunk relationship rules at the conceptual layer and the
lexical layer. Built a database of semantic chunk relationship rules targeting
the 'pp' concept. Experiments show that these rules are effective for extracting
semantic chunks that contain the 'pp' concept from
Established the mapping
between the domain sentence category space and the sentence category space. The
correspondence between the two spaces includes obvious and non-obvious
correspondences. For obvious correspondences, the mapping is established by
classifying the semantic chunks of domain sentence categories and of sentence
categories. Through this mapping, the anticipation defined for a semantic chunk
of a domain sentence category can be applied to the corresponding semantic chunk
of a sentence category.
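One way to picture how anticipation is carried across the mapping is a pair of lookup tables. The sketch below is an assumption-laden illustration only: the category and role labels, the table contents, and the function name `transfer_anticipation` are all hypothetical and do not come from the HNC knowledge database.

```python
from typing import Dict, Tuple

# (category label, chunk role label); both labels are placeholders.
ChunkKey = Tuple[str, str]

# Assumption: a toy mapping for an obvious correspondence between a chunk of
# a domain sentence category and a chunk of a sentence category.
DOMAIN_TO_SENTENCE: Dict[ChunkKey, ChunkKey] = {
    ("DOMAIN_CAT_1", "D_ROLE_A"): ("CAT_1", "ROLE_A"),
}

# Assumption: toy anticipation knowledge recorded on the domain side.
DOMAIN_ANTICIPATION: Dict[ChunkKey, str] = {
    ("DOMAIN_CAT_1", "D_ROLE_A"): "pp",
}


def transfer_anticipation() -> Dict[ChunkKey, str]:
    """Copy each domain-side anticipation onto its mapped sentence-category chunk."""
    transferred: Dict[ChunkKey, str] = {}
    for domain_key, concept in DOMAIN_ANTICIPATION.items():
        target = DOMAIN_TO_SENTENCE.get(domain_key)
        if target is not None:  # unmapped chunks are skipped
            transferred[target] = concept
    return transferred
```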
principle of object-content structure in GBK without sentence ecdysis. Designed a
method for object-content decomposition in GBK without sentence degeneration into
a chunk. Object-content decomposition poses two pivotal problems. The first is
whether a GBK has an object-content structure at all. The second is deciding
which part is the object and which part is the content. This dissertation
resolves both problems and designs rules for all GBKs in the basic sentence
categories.
caused by semantic chunk sharing between sentences, especially ellipsis of the
'pp' concept. This dissertation presents a method for ellipsis resolution that
uses the relationships between sentences and the analysis of semantic chunk
structure. Experiments show that the method resolves ellipsis caused by full
semantic chunk sharing exactly, and resolves ellipsis caused by partial semantic
chunk sharing effectively.
In summary, within the framework of HNC theory, this dissertation presents a
method for person and organization name recognition based on sentence category
analysis, domain sentence categories, and the analysis of semantic chunk
structure. Furthermore, this dissertation studies solutions to several problems
in HNC theory, such as object-content decomposition in GBK, semantic relationship
knowledge at the conceptual layer and the lexical layer, and the resolution of
ellipsis of the 'pp' concept. These studies strengthen the practicability of HNC
theory and provide a new approach for HNC theory's
Key words: HNC; Person Name Recognition; Organization Name Recognition; Conceptual Relationship Knowledge