National Taiwan Normal University Course Outline
Fall, 2019

Please respect intellectual property rights and do not photocopy textbooks without permission.

I. Course Information
Serial No.: 0540
Course Level: Master
Course Code: ISM0800
Chinese Course Name: 資訊檢索原理與應用
Course Name: Principles and Applications of Information Retrieval
Department: Graduate Institute of Library and Information Studies
Two/one semester: 1
Req. / Sel.: Sel.
Credits: 3.0
Lecturing hours: 3
Prerequisite Course: ◎1. This is a cross-level course and is available for junior and senior undergraduate students, master's students and PhD students. 2. If the listed course is a doctoral-level course, it is only available for master's students and PhD students.
Comment: GLIS Seminar Room A
Course Description:
Time / Location: Wed. 7-9, Main 11111
Curriculum Goals Corresponding to the Departmental Core Goal
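1. Understand the major theories of information retrieval; understand the search behavior patterns of information users; acquire skills in implementing and evaluating information retrieval systems; keep up with research and development trends in information retrieval.
Master: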
 1-3 Explore information user and the theory and methodology of information use
 2-1 Ability to analyze and solve problems
 2-2 Ability to manage and utilize digital and Internet technology
 2-3 Ability to plan and evaluate information systems
 4-1 Develop people-centered thinking and value knowledge. Facilitate the freedom and the usability of knowledge with information technology and innovative services

II. General Syllabus
Instructor(s): 曾元顯 (Yuen-Hsien Tseng)
Schedule

Week   Course Content                                        Remarks
1      Course Overview
2      Search Engines and Information Retrieval              Chap. 1
3      Architecture of a Search Engine                       Chap. 2
4      System Implementation Tutorial I
5      Crawls and Feeds; Crawler Implementation              Chap. 3
6      Processing Text, Term Extraction, Word Embedding      Chap. 4
7      System Implementation Tutorial II
8      Ranking with Indexes, Page Ranking                    Chap. 5
9      Queries and Interfaces, Chatbot Q&A Systems           Chap. 6
10     Retrieval Models, Knowledge Graph Reasoning           Chap. 7
11     System Implementation Tutorial III
12     Evaluating Search Engines, Evaluation Metrics         Chap. 8
13     Classification and Clustering, Machine Learning       Chap. 9
14     Social Search and Human Factors                       Chap. 10
15     Beyond Bag of Words, Deep Learning                    Chap. 11
16     Talk: User Information Retrieval Behavior
17     Talk: Mobile User Interface
18     Final system demonstration

Lecturing Methodologies
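Methods             Notes
Formal lecture      Course content is presented with slides.
Group discussion    An instant response system (IRS: http://pro.ccr.tw/) is used to poll students' reactions and stimulate discussion.
Lab/Studio          Tools for search engines, document classification, topic clustering, and similar tasks are provided for students to install and use.
Other               Paper reading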
Grading assessment
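Methods                        Percentage   Notes
Assignments                    20 %         Several assignments, such as installing the relevant systems and presenting the results, online in-class quizzes, and online remote quizzes.
Class discussion involvement   10 %         Asking questions, responding, and taking part in discussion in class.
Attendance                     10 %         Class attendance.
Presentation                   30 %         Oral presentation and discussion of the chapters assigned for each class.
Case study reports             30 %         The course teaches relevant open source software. Each student plans a topic and uses the tools learned to build a small-scale system for web search, document classification, topic clustering, automatic summarization, or text generation, and demonstrates the result at the end of the semester.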
Required and Recommended Texts/Readings with References

Required Reading

Croft, B., Metzler, D., & Strohman, T. (2015). Search Engines: Information Retrieval in Practice. Addison-Wesley. https://ciir.cs.umass.edu/irbook/.

 

BOOKS

General IR (CS)

  1. Büttcher, S., Clarke, C.L.A., & Cormack, G.V. (2016). Information Retrieval: Implementing and Evaluating Search Engines. MIT Press.
  2. Baeza-Yates, R., & Ribeiro-Neto, B. (2011). Modern Information Retrieval: The Concepts and Technology behind Search, 2nd ed. Addison-Wesley.
  3. Chowdhury, G.G. (2010). Introduction to Modern Information Retrieval, 3rd ed. Neal-Schuman.
  4. Grossman, D.A., & Frieder, O. (2004). Information Retrieval: Algorithms and Heuristics, 2nd ed. Springer.
  5. Hearst, M.A. (2009). Search User Interfaces. Cambridge University Press.
  6. Manning, C.D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press. (http://nlp.stanford.edu/IR-book/)
  7. Salton, G. (1983). Introduction to Modern Information Retrieval. McGraw-Hill.
  8. Sparck Jones, K., & Willett, P. (1997). Readings in Information Retrieval. Morgan Kaufmann.
  9. van Rijsbergen, C.J. (1979). Information Retrieval. Butterworths.

 

General (LIS)

  1. Goker, A., & Davies, J. (eds.) (2009). Information Retrieval: Searching in the 21st Century. Wiley.
  2. Harter, S.P. (1986). Online Information Retrieval: Concepts, Principles and Techniques. Academic Press.
  3. Hunter, E.J. (2009). Classification Made Simple: An Introduction to Knowledge Organisation and Information Retrieval, 3rd ed. Ashgate.
  4. Lancaster, F.W., & Warner, A.J. (1993). Information Retrieval Today. Info Resources Press.
  5. Saracevic, T., & Marchionini, G. (eds.) (2012). Relevance in Information Retrieval. Morgan & Claypool.

 

Search Engines

  1. Langville, A.N., & Meyer, C.D. (2012). Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press.
  2. Battelle, J. (2005). The Search: How Google and its Rivals Rewrote the Rules of Business and Transformed Our Culture. Nicholas Brealey.

 

Search Engine Optimization (SEO)

  1. Kennedy, A.F., & Hauksson, K.M. (2012). Global Search Engine Marketing: Getting Better International Search Engine Results. Que.
  2. Enge, E., et al. (2009). The Art of SEO: Mastering Search Engine Optimization. O'Reilly.
  3. Lieb, R. (2009). The Truth about Search Engine Optimization. FT Press.

 

Cross-Language IR

  1. Peters, C., Braschler, M., & Clough, P. (2012). Multilingual Information Retrieval: from Research to Practice. Springer.

 

Multimedia IR

  1. Müller, M. (2007). Information Retrieval for Music and Motion. Springer.
  2. Ras, Z.W., & Wieczorkowska, A. (eds.) (2010). Advances in Music Information Retrieval. Springer.

 

Web Mining

  1. Kaushik, A. (2009). Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity. Sybex.
  2. Chakrabarti, S. (2002). Mining the Web: Analysis of Hypertext and Semi Structured Data. Morgan Kaufmann.
  3. Liu, B. (2011). Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, 2nd ed. Springer.
  4. Web NER ToolKit (2019, DS4NER), https://sites.google.com/site/nculab/projects/web-ner-tool

 

Search as Learning

  1. Vakkari, P. (2016). Searching as Learning: A Systematization Based on Literature. Journal of Information Science, 42(1), 7-18.
  2. Carsten Eickhoff, Jacek Gwizdka, Claudia Hauff, Jiyin He (2017). Introduction to the special issue on search as learning. https://link.springer.com/article/10.1007/s10791-017-9315-9
  3. Search as Learning at CIKM 2018 - Claudia Hauff, https://chauff.github.io/2018-08-07-sal-at-cikm/

 

Semantic Web

  1. Davies, J., Studer, R., & Warren, P. (eds.) (2006). Semantic Web Technologies: Trends and Research in Ontology-based Systems. Wiley.
  2. Allemang, D., & Hendler, J. (2011). Semantic Web for the Working Ontologist, 2nd ed. Morgan Kaufmann.

 

User Behavior

  1. Ingwersen, P., & Järvelin, K. (2005). The Turn: Integration of Information Seeking and Retrieval in Context. Springer.
  2. Ruthven, I., & Kelly, D. (eds.) (2011). Interactive Information Seeking, Behaviour and Retrieval. Facet.
  3. Spink, A., & Jansen, B.J. (2004). Web Search: Public Searching of the Web. Springer.
  4. Warner, J. (2009). Human Information Retrieval. The MIT Press.

 

Socio-Technical Aspects

  1. Brin, D. (1998). The Transparent Society: Will Technology Force Us to Choose Between Privacy and Freedom? Basic Books.
  2. Huberman, B.A. (2001). The Laws of the Web: Patterns in the Ecology of Information. MIT Press.
  3. Lesser, E.L., Fontaine, M.A., & Slusher, J.A., eds. (2000). Knowledge and Communities. Butterworth-Heinemann.
  4. Lessig, L. (1999). Code and Other Laws of Cyberspace. Basic Books.
  5. 吳世弘 (Wu, S.-H.) (2017). "The CYUT System on Social Book Search Track since INEX 2013 to CLEF 2016". Journal of Library and Information Science (圖書館學與資訊科學), 43(2), 6-19. http://140.122.104.2/ojs../index.php/jlis/article/view/733

 

Neural Approaches to Information Retrieval

  1. (Word Embedding) Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems.
  2. Word embeddings in 2017: Trends and future directions: http://ruder.io/word-embeddings-2017/ (OOV handling, Subword-level embeddings, Multi-sense embeddings, Phrases and multi-word expressions).
  3. (fastText from Facebook) Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of Tricks for Efficient Text Classification. CoRR, abs/1607.01759.
  4. (Deep Learning Model from Google) Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. https://arxiv.org/pdf/1810.04805.pdf
  5. (GPT-2: Deep Learning Model from OpenAI) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners. Retrieved from https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
  6. (XLNet) Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le (2019). XLNet: Generalized Autoregressive Pretraining for Language Understanding. https://arxiv.org/abs/1906.08237.

 

CONFERENCES

  1. ACM SIGIR Annual Conference. http://www.acm.org/sigir/
  2. ASIS&T Annual Conference. http://www.asis.org/
  3. JCDL (Joint Conference on Digital Libraries). http://www.jcdl.org/
  4. TREC (Text REtrieval Conference). http://trec.nist.gov/
  5. NTCIR, http://research.nii.ac.jp/ntcir/index-en.html
  6. WWW Annual Conference. http://www.iw3c2.org/

 

JOURNALS

D-Lib Magazine

Information Processing and Management (IP&M)

Information Research

Journal of the American Society for Information Science and Technology (JASIST)

Journal of Documentation (JDoc)

 

WEB RESOURCES

  1. ACM SIGIR Information Retrieval Resources http://www.sigir.org/resources.html
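  2. 鄭卜壬 & 李家豪 (2007). Introduction to Digital Archive Technology (數位典藏技術導論), Chapter 4: Information Retrieval Techniques. http://ebook.iis.sinica.edu.tw/pdf/ch4_InformationRetrieval.pdf
  3. Academia Sinica (2007). Introduction to Digital Archive Technology (數位典藏技術導論). http://ebook.iis.sinica.edu.tw/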
  4. Glasgow IR resources (http://ir.dcs.gla.ac.uk/resources.html)
  5. UCLA Graduate School of Education & Information School (http://polaris.gseis.ucla.edu/jfurner/00-01/273/273res.html)
  6. Information Research Weblog (http://www.free-conversant.com/irweblog/)
  7. Search Engine Meeting Conference (http://www.infonortics.com/searchengines/)
  8. Web IR and IE (http://www.webir.org/)