Publication date: 2007-11 Publisher: The Commercial Press (商务印书馆) Author: Fang Chengyu (方称宇) Pages: 225
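Summary
Corpus linguistics and computational linguistics are two foundational disciplines driving the rapid development of natural language processing technology. 《英语语料库与自动语法分析》 (English Corpora and Automated Grammatical Analysis) is a monograph spanning both fields. Set against the background of the International Corpus of English, it focuses on the grammatical analysis of large corpora, in particular the series of difficulties that spoken English material poses for automatic processing by computer. The book covers two major techniques, probability-based automatic word class identification and instance-based automatic syntactic parsing, and devotes dedicated chapters to the evaluation of parsing, with an in-depth quantitative evaluation of the actual performance of two software systems, AUTASYS and The Survey Parser. In addition, the book examines the automatic analysis of prepositional phrases, especially the automatic determination of their syntactic functions, and explains the application of automated grammatical analysis to speech synthesis and speech recognition.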
Table of Contents
Preface
Foreword (前言)
List of Figures
List of Tables
Abstract
1. Introduction
  1.1. What is Parsing?
  1.2. The Introspective View
  1.3. The Retrospective View
  1.4. Data-Oriented Parsing
  1.5. General Problems
  1.6. The Proposed Research
    1.6.1. Background to the Proposed Research
    1.6.2. The Basic Approach of the Proposed Research
    1.6.3. The Strengths and Novelties of the Proposed Approach
      1.6.3.1. Automated Grammar Generation
      1.6.3.2. De-Lexicalised Terminal Nodes
      1.6.3.3. Global Parse with Subcategorisation Features
      1.6.3.4. High-Quality Partial Parse
      1.6.3.5. Intrinsic Ability to Learn
  1.7. The Organisation of the Book
2. The Automatic Analysis of English Word Classes
  2.1. An Overview of Word Class Tagging
  2.2. Major Word Class Tagging Schemes
    2.2.1. The Lancaster-Oslo/Bergen Tagging Scheme
      2.2.1.1. The Lancaster-Oslo-Bergen Corpus
      2.2.1.2. The Lancaster-Oslo-Bergen Tag Set
      2.2.1.3. Summary
    2.2.2. The International Corpus of English Tagging Scheme
      2.2.2.1. The International Corpus of English
      2.2.2.2. The International Corpus of English Tag Set
    2.2.3. A Comparison of LOB and ICE
  2.3. Word Class Tagging Methodologies
    2.3.1. The Rule-Based Approach
    2.3.2. The Probabilistic Approach
  2.4. AUTASYS: A Hybrid Tagging System
    2.4.1. A Probabilistic Approach Using the LOB Tag Set
      2.4.1.1. The Tag Assignment Module
        2.4.1.1.1. Tokenisation
        2.4.1.1.2. The treatment of "."
        2.4.1.1.3. The treatment of "'"
        2.4.1.1.4. Sentence boundary markers
      2.4.1.2. Orthographic Analysis
      2.4.1.3. Lexicon Lookup
        2.4.1.3.1. The lexicon
        2.4.1.3.2. The coverage of the lexicon
      2.4.1.4. Morphological Analysis
    2.4.2. The Idiom Identification Module
    2.4.3. The Probabilistic Tag Selection Module
      2.4.3.1. The Bigram Probabilistic Matrix
      2.4.3.2. Implementing Probabilistic Tag Selection
    2.4.4. The Rule-Based Refinement Module
    2.4.5. Empirical Evaluation
    2.4.6. Permissive AUTASYS-LOB Disagreements
      2.4.6.1. NNP-NPT
      2.4.6.2. JJ-JJB
      2.4.6.3. NNP-NPL
      2.4.6.4. RB-NN
    2.4.7. Summary
  2.5. A Rule-Based Approach towards LOB to ICE Translation
    2.5.1. Solutions for Verbs
      2.5.1.1. Auxiliary vs. Lexical
      2.5.1.2. Monotransitive vs. Complex Transitive
      2.5.1.3. Finite vs. Nonfinite
    2.5.2. Closed Sets
    2.5.3. Initial Results
    2.5.4. Problems
    2.5.5. Summary
3. The Automatic Induction of a Formal Grammar
4. Robust Practical Analogy-Based Parsing
5. Extensive Evaluations of the Survey Parser
6. The Resolution of Prepositional Phrases
7. Conclusions and Further Work
References
Appendix A: A List of LOB Tags
Appendix B: A List of ICE Tags
Appendix C: A List of AUTASYS Idioms
Appendix D: A List of ICE Parsing Symbols
Appendix E: A List of ICE Prepositions in Descending Frequency Order
Appendix F: A Distributional Profile of ICE-GB Prepositions
Index
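Editor's Recommendation
The central idea of the book is to turn an already analysed corpus into a syntactic knowledge base, extract phrase structure grammar rules from it, and, by instance-based means, retrieve from the knowledge base the best syntactic tree for each sentence to be analysed. The book describes the research on each of these components in detail, provides an in-depth quantitative evaluation of the system's actual performance, and devotes dedicated chapters to the evaluation of parsing. It also examines the automatic analysis of prepositional phrases, especially the automatic determination of their syntactic functions, because this work is closely related to the analysis of syntactic similarity. The book further introduces and explains the application of automated grammatical analysis to speech synthesis and speech recognition, in the hope that it will be useful to readers.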
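To make the idea of extracting phrase structure rules from an analysed corpus concrete, the following is a minimal Python sketch. It is not the book's implementation: the toy treebank, the node labels, and the helper names (`is_leaf`, `extract_rules`, `induce_grammar`) are all assumptions made for illustration, and the labels are not the ICE parsing symbols. Rules stop at the word class tag, so the words themselves never enter the grammar, which is one plausible reading of "de-lexicalised terminal nodes" in the table of contents.

```python
from collections import Counter

# Hypothetical toy treebank. Each tree is (label, child, child, ...);
# a leaf is (word_class_tag, word). Labels are illustrative only.
TREEBANK = [
    ("S",
     ("NP", ("PRON", "she")),
     ("VP", ("V", "reads"), ("NP", ("ART", "the"), ("N", "book")))),
    ("S",
     ("NP", ("ART", "the"), ("N", "dog")),
     ("VP", ("V", "sleeps"))),
]

def is_leaf(node):
    """A leaf pairs a word class tag with a word string."""
    return len(node) == 2 and isinstance(node[1], str)

def extract_rules(node, rules):
    """Record one rule (parent -> child labels) per non-leaf node, recursively."""
    if is_leaf(node):
        return  # stop at the tag level: words are not part of any rule
    label, children = node[0], node[1:]
    rules[(label, tuple(child[0] for child in children))] += 1
    for child in children:
        extract_rules(child, rules)

def induce_grammar(treebank):
    """Count every phrase structure rule observed in the treebank."""
    rules = Counter()
    for tree in treebank:
        extract_rules(tree, rules)
    return rules

if __name__ == "__main__":
    for (lhs, rhs), count in induce_grammar(TREEBANK).most_common():
        print(f"{lhs} -> {' '.join(rhs)}  (seen {count}x)")
```

The rule counts could then serve as a simple frequency profile of the knowledge base; how the book actually selects the best tree for a new sentence is not specified in this description, so that step is left out of the sketch.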