Course: Information Retrieval

Course title Information Retrieval
Course code KIV/IR
Organizational form of instruction Lecture + Tutorial
Level of course Master
Year of study not specified
Semester Summer
Number of ECTS credits 6
Language of instruction Czech
Status of course Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Mrkvička Miroslav, doc. Ing. Ph.D.
Course content
1. Motivation and program of the course. Introduction.
2. Information extraction from the Web, web crawlers.
3. Tokenization, stemming, Porter stemmer, lemmatization, POS tagging, parsing. Dictionaries, edit distance.
4. Information retrieval, Boolean model, indexing.
5. Query and document similarity, vector space model, top hits selection.
6. The Web as a graph, link analysis, PageRank, HITS.
7. Evaluation of an IR system, standard evaluation corpora, evaluation of relevance.
8. XML retrieval, vector space model for XML retrieval.
9. Question answering.
10. Multimedia information retrieval.
11. Text classification, feature selection, classification evaluation, classification in the vector space model. Detection of plagiarism and spam.
12. Text clustering, determining the number of clusters. News clustering systems.
13. Introduction to text analysis: information extraction, text summarization, opinion mining.

Learning activities and teaching methods
Lecture supplemented with a discussion, Project-based instruction, Discussion, Multimedia supported teaching, Students' portfolio, Skills demonstration, Task-based study method, Individual study, Textual studies, Practicum
  • Individual project (40) - 40 hours per semester
  • Contact hours - 65 hours per semester
  • Preparation for an examination (30-60) - 55 hours per semester
Prerequisites
Knowledge
get a better understanding of the possibilities of application software with the aim of better processing the increasing amount of data
explain the principles of relational databases, data integrity and basic SQL statements; describe data modelling approaches
describe the principles of procedural and object-oriented programming languages including the basic control structures and data representation forms, explain the fundamental data structures and algorithms to work with them
Skills
sort, process and present the acquired information in both written and oral forms in Czech and English; produce documentation of the implemented work or its components
acquire and process information from English-language sources
design a small- to medium-sized database or information system; design and implement a simple stand-alone web application
master the principles of writing well-documented and robust program code; make use of theoretical as well as practical knowledge of algorithms, data structures and specific developer tools
Competences
N/A
Learning outcomes
Knowledge
explain and illustrate the methods and models for the representation and processing of large-scale unstructured data
describe the principles of natural language processing and of textual data search
Skills
make efficient use of the methods and technologies for the search in large-scale unstructured data
implement various web search methods and basic natural language processing techniques
Competences
N/A
make use of one's professional knowledge, skills, and general abilities in English and, to some extent, also in one other foreign language
Teaching methods
Knowledge
Lecture supplemented with a discussion
Self-study of literature
Practicum
Individual study
Task-based study method
Multimedia supported teaching
Skills
Skills demonstration
Competences
Lecture supplemented with a discussion
Assessment methods
Knowledge
Individual presentation at a seminar
Continuous assessment
Test
Combined exam
Skills
Project
Skills demonstration during practicum
Combined exam
Competences
Combined exam
Recommended literature
  • Baeza-Yates, R.; Ribeiro-Neto, Berthier. Modern information retrieval. Harlow : Addison-Wesley, 1999. ISBN 0-201-39829-X.
  • Büttcher, Stefan; Clarke, Charles L. A.; Cormack, Gordon V. Information Retrieval: Implementing and Evaluating Search Engines. Cambridge : The MIT Press, 2016. ISBN 978-0-262-52887-0.
  • Chakrabarti, Soumen. Mining the web : discovering knowledge from hypertext data. San Francisco : Morgan Kaufmann Publishers, 2003. ISBN 1-55860-754-7.
  • Jurafsky, Daniel; Martin, James H. Speech and language processing : an introduction to natural language processing, computational linguistics, and speech recognition. 2nd ed. Upper Saddle River : Pearson/Prentice Hall, 2009. ISBN 978-0-13-504196-3.
  • Manning, Christopher D.; Raghavan, Prabhakar; Schütze, Hinrich. Introduction to information retrieval. 1st pub. New York : Cambridge University Press, 2008. ISBN 978-0-521-86571-5.

