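Course Syllabus
Please respect intellectual property rights. Do not use illegally photocopied textbooks, so as to avoid breaking the law.
Course title (Chinese) 自然語言處理
(English) Natural Language Processing
Offering unit Graduate Institute of Computer Science and Engineering
Course code I5840A
Instructor 葉慶隆
Credits 3.0 Required/Elective Elective Year level Graduate
Prerequisite courses or skills: programming, basic mathematics (calculus, probability, discrete mathematics)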
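Course overview and objectives: Natural language processing and computer vision are the main communication and perception capabilities of artificial intelligence acting as an intelligent agent, and data-driven deep learning is an important foundation that gives computers these capabilities. Deep learning has recently become the infrastructure for realizing communication and perception, and it is widely integrated into courses.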

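Artificial intelligence has long been one of the most complex areas of computer science, and students normally need a solid background in computing and mathematics to learn it well. This course is aimed at students who have only basic programming skills and basic mathematics. It takes a problem-oriented approach and combines various deep learning techniques to introduce students to natural language processing. The course proceeds in the following order: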

Introduction to natural language processing
Basic string processing: Regular Expressions, Text Normalization, Edit Distance
Sentiment analysis based on logistic classification
Sentiment analysis based on Naive Bayes
Vector space models
Vector semantics and word embeddings
Sentiment analysis based on neural networks
Recurrent neural networks (RNN) and language modeling
Sequence-to-sequence neural networks and machine translation
NLP and attention models
Textbook Dan Jurafsky and James H. Martin, Speech and Language Processing, 3rd ed.
Reference materials Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press, 2016
TensorFlow Tutorial, tensorflow.org
Charles Severance, Python for Everybody
Introduction to AI, udacity.com
AI for Everyone, coursera.org
Introduction to TensorFlow for Deep Learning, udacity.com
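Course outline | Student learning objectives | Unit learning activities | Assessment of learning outcomes | Remarks
Unit topic | Content outline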
1 Welcome and introduction to Natural Language Processing NLP background
Knowledge in Speech and Language Processing
Ambiguity
Models and Algorithms
Language, Thought, and Understanding
Some Brief History
NLP basis  
2 Regular Expressions, Text Normalization, Edit Distance (1) Regular Expressions
Words and Corpora
Word Tokenization
Word Normalization
Regular Expressions, Text Normalization  
3 Regular Expressions, Text Normalization, Edit Distance (2) Python RE
Python NLTK
Python programming: re and nltk  
4 Sentiment Analysis Using Logistic Classification (1) Supervised ML and Sentiment Analysis
Vocabulary & Feature Extraction
Negative and Positive Frequencies
Feature Extraction with Frequencies
Sentiment Analysis Using Logistic Classification  
5 Sentiment Analysis Using Logistic Classification (2) Preprocessing
Putting it All Together
Logistic Regression Overview
Logistic Regression: Training
Logistic Regression: Testing
Sentiment Analysis Using Logistic Classification. Project grouping begins. Project phase 1: Python re, nltk; Sentiment Analysis Using Logistic Classification
6 Sentiment Analysis with Naïve Bayes (1) Probability and Bayes’ Rule
Naïve Bayes Introduction
Laplacian Smoothing
Log-Likelihood
Sentiment Analysis with Naïve Bayes  
7 Sentiment Analysis with Naïve Bayes (2) Training Naïve Bayes
Testing Naïve Bayes
Applications of Naïve Bayes
Naïve Bayes Assumptions
Sentiment Analysis with Naïve Bayes. Project grouping begins. Project phase 2: Sentiment Analysis Using the Naïve Bayes Method
8 Vector Space Models Intro. to Vector Space Models
Word by Word and Word by Doc.
Euclidean Distance
Cosine Similarity
Manipulating Words in Vector Spaces
Visualization and PCA
Vector Space Models  
9 Midterm exam week Midterm exam Midterm exam
• Test
10 Vector Semantics and Embeddings (1) Overview
Basic Word Representation
Word Embeddings
How to Create Word Embeddings
Word Embedding Methods
Continuous Bag-of-Words Model
Vector Semantics and Embeddings
• Lecture
• Media-based teaching
• Hands-on computer lab
11 Vector Semantics and Embeddings (2) Cleaning and Tokenization
Sliding Window of Words in Python
Transforming Words into Vectors
Architecture of the CBOW Model
Architecture of the CBOW Model: Activation Functions
Vector Semantics and Embeddings
12 Vector Semantics and Embeddings (3) Training a CBOW Model
Extracting Word Embedding Vectors
Evaluating Word Embeddings
Examples of Word Embedding
Vector Semantics and Embeddings
13 Neural Networks for Sentiment Analysis Neural Networks and Forward Propagation
Trax: Neural Networks
Trax: Layers
Dense and ReLU Layers
Serial Layer
Other Layers
Training
Neural Networks for Sentiment Analysis
14 Recurrent Neural Networks for Language Modeling Traditional Language Models
Recurrent Neural Networks
Applications of RNNs
Math in Simple RNNs
Cost Function for RNNs
Gated Recurrent Units
Recurrent Neural Networks for Language Modeling
15 Neural Machine Translation and Transformer Seq2Seq
Alignment
Attention
Setup for Machine Translation
Introduction to Transformer
Neural Machine Translation and Transformer
16 Final reports (1) Final report Final report
17 Final reports (2) Final report Final report
18 Final exam week Final exam Final exam

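Summary of teaching points:
Teaching material selection: □ Self-compiled materials □ Provided by the textbook author
Assessment methods: :20%   Assignments: 30%   Tests: 30%   Project: 20%
Teaching resources: ■ Electronic course materials ■ Course website
Exam disqualification policy (absences): http://eboard.ttu.edu.tw/ttuwebpost/showcontent-news.php?id=504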