Lecturer(s)
|
- Knapek Josef, doc. Ing. Ph.D.
|
Course content
|
1. Taxonomy of natural language processing tasks. Typical problems and applications.
2. Tokenization, stemming, Porter's algorithm, lemmatization, POS tagging, parsing. Dictionaries, edit distance.
3. Information retrieval, Boolean model, indexing.
4. Query and document similarity, vector space model, selection of top hits.
5. Evaluation of an IR system, standard evaluation corpora.
6. XML retrieval, vector space model for XML retrieval, evaluation of relevance.
7. Probabilistic information retrieval. Matrix decompositions, latent semantic indexing.
8. Text classification, feature selection, classification evaluation, classification in the vector space model. Detection of plagiarism and spam.
9. Text clustering, determining the number of clusters. News clustering systems.
10. Information extraction, event extraction, relation extraction.
11. Text summarization, text generation.
12. Opinion mining. Application to social media texts.
13. Web mining, content analysis, web crawling, distributed indexes, the Web as a graph, link analysis, PageRank, HITS.
|
Learning activities and teaching methods
|
Lecture supplemented with a discussion, Project-based instruction, Discussion, Multimedia supported teaching, Students' portfolio, Skills demonstration, Task-based study method, Individual study, Textual studies, Practicum
- Individual project (40): 40 hours per semester
- Contact hours: 65 hours per semester
- Preparation for an examination (30-60): 55 hours per semester
|
prerequisite |
---|
Knowledge |
---|
navigate the possibilities of application software in order to achieve better orientation in the growing amount of information |
describe the principles of programming in imperative and object languages, including basic control structures and methods of data representation, explain basic data structures and algorithms for working with them |
explain the principles of relational databases, data integrity and basic SQL commands, describe data modeling procedures |
Skills |
---|
design a database system or information system of small to medium scale, design and implement a simpler stand-alone and web application |
master the principles of writing well-documented and robust program code, and apply theoretical and practical knowledge of algorithms, data structures and specific development tools |
sort, process and present the obtained information in written and oral form in English; create documentation for the implemented work or a part of it |
obtain and process information from sources in the English language |
Competences |
---|
N/A |
learning outcomes |
---|
Knowledge |
---|
describe the principles of natural language processing and searching in textual data |
explain and illustrate methods and models for representing and processing large unstructured data |
Skills |
---|
effectively use methods and technologies for searching large unstructured data |
implement various web search methods and basic natural language processing methods |
Competences |
---|
By completing this course, the student gains not only the ability to implement various natural language processing methods but also professional knowledge of their use in software engineering, business intelligence, social media monitoring, fraud detection, detection of dangerous texts and opinions, sentiment analysis, etc. The student also gains the ability to employ formal methods in the construction of such software. |
N/A |
teaching methods |
---|
Knowledge |
---|
Practicum |
Lecture supplemented with a discussion |
Task-based study method |
Individual study |
Self-study of literature |
Multimedia supported teaching |
Skills |
---|
Skills demonstration |
Competences |
---|
Lecture supplemented with a discussion |
assessment methods |
---|
Knowledge |
---|
Individual presentation at a seminar |
Continuous assessment |
Test |
Combined exam |
Skills |
---|
Project |
Skills demonstration during practicum |
Combined exam |
Competences |
---|
Combined exam |
Individual presentation at a seminar |
Recommended literature
|
- Baeza-Yates, Ricardo; Ribeiro-Neto, Berthier. Modern information retrieval. Harlow : Addison-Wesley, 1999. ISBN 0-201-39829-X.
- Jurafsky, Daniel; Martin, James H. Speech and language processing : an introduction to natural language processing, computational linguistics, and speech recognition. 2nd ed. Upper Saddle River : Pearson/Prentice Hall, 2009. ISBN 978-0-13-504196-3.
- Manning, Christopher D.; Raghavan, Prabhakar; Schütze, Hinrich. Introduction to information retrieval. 1st pub. New York : Cambridge University Press, 2008. ISBN 978-0-521-86571-5.
|