Course info
KIV / IR-E
:
Course description
Department/Unit / Abbreviation
|
KIV
/
IR-E
|
Academic Year
|
2024/2025
|
Title
|
Information Retrieval
|
Form of course completion
|
Exam
|
Accredited / Credits
|
Yes,
6
Cred.
|
Type of completion
|
Combined
|
Time requirements
|
Lecture
3
[Hours/Week]
Tutorial
2
[Hours/Week]
|
Course credit prior to examination
|
Yes
|
Automatic acceptance of credit before examination
|
No
|
Included in study average
|
YES
|
Language of instruction
|
English
|
Occ/max
|
|
|
|
Summer semester
|
0 / -
|
0 / -
|
0 / -
|
Winter semester
|
0 / -
|
0 / -
|
0 / -
|
Repeated registration
|
NO
|
Timetable
|
Yes
|
Semester taught
|
Summer semester
|
Minimum (B + C) students
|
1
|
Optional course |
Yes
|
Internship duration
|
0
|
No. of hours of on-premise lessons |
|
Evaluation scale |
1|2|3|4 |
Periodicity |
every year
|
Evaluation scale for credit before examination |
S|N |
Periodicity specification |
|
Fundamental theoretical course |
No
|
Fundamental course |
No
|
Substituted course
|
None
|
Preclusive courses
|
KIV/IR
|
Prerequisite courses
|
N/A
|
Informally recommended courses
|
KIV/IDT or KIV/PT
|
Courses depending on this Course
|
KIV/NLP
|
Course objectives:
|
To give students detailed knowledge of how to create complex software for processing natural-language texts.
|
Requirements on student
|
Elaboration and defence of a semester software project (earning at least 50% of the possible points), at least 50% of the points for active participation in exercise classes, at least 50% of the points for the control test, and at least 50% of the possible points on the examination test.
|
Content
|
1. Taxonomy of the natural language processing tasks. Typical problems and applications.
2. Tokenization, stemming, Porter's algorithm, lemmatization, POS tagging, parsing. Dictionaries, edit distance.
3. Information retrieval, Boolean model, indexing.
4. Query and document similarity, vector space model, top hits selection.
5. Evaluation of an IR system, standard evaluation corpora.
6. XML retrieval, vector space model for XML retrieval, evaluation of relevance.
7. Probabilistic information retrieval. Matrix decompositions, latent semantic indexing.
8. Text classification, feature selection, classification evaluation, classification in the vector space model. Detection of plagiarism and spam.
9. Text clustering, determining the number of clusters. News clustering systems.
10. Information extraction, event extraction, relation extraction.
11. Text summarization, text generation.
12. Opinion mining. Application on social media texts.
13.Web mining, content analysis, web crawling, distributed indexes, the Web as a graph, link analysis, PageRank, HITS.
|
Activities
|
|
Fields of study
|
Individual lectures are available for students to download in PDF format, along with a substantial part of the lectures compiled into a single electronic book (PDF).
|
Guarantors and lecturers
|
|
Literature
|
- Basic: Manning, Christopher D.; Raghavan, Prabhakar; Schütze, Hinrich. Introduction to information retrieval. 1st pub. New York : Cambridge University Press, 2008. ISBN 978-0-521-86571-5.
- Recommended: Baeza-Yates, R.; Ribeiro-Neto, Berthier. Modern information retrieval. Harlow : Addison-Wesley, 1999. ISBN 0-201-39829-X.
- Recommended: Jurafsky, Daniel; Martin, James H. Speech and language processing : an introduction to natural language processing, computational linguistics, and speech recognition. 2nd ed. Upper Saddle River : Pearson/Prentice Hall, 2009. ISBN 978-0-13-504196-3.
|
Time requirements
|
All forms of study
|
Activities
|
Time requirements for activity [h]
|
Individual project (40)
|
40
|
Preparation for an examination (30-60)
|
55
|
Contact hours
|
65
|
Total
|
160
|
|
Prerequisites
|
Knowledge - students are expected to possess the following knowledge before the course commences to finish it successfully: |
navigate the possibilities of application software in order to achieve better orientation in the growing amount of information |
describe the principles of programming in imperative and object languages, including basic control structures and methods of data representation, explain basic data structures and algorithms for working with them |
explain the principles of relational databases, data integrity and basic SQL commands, describe data modeling procedures |
Skills - students are expected to possess the following skills before the course commences to finish it successfully: |
design a database system or information system of small to medium scale, design and implement a simpler stand-alone and web application |
master the principles of creating well-documented and robust program codes, practically use theoretical and practical knowledge about working with algorithms, data structures and specific development tools |
sort, process and present the obtained information in written and oral form in English; create documentation for the completed work or a part of it |
obtain and process information from sources in the English language |
Competences - students are expected to possess the following competences before the course commences to finish it successfully: |
N/A |
|
Learning outcomes
|
Knowledge - knowledge resulting from the course: |
describe the principles of natural language processing and searching in textual data |
explain and illustrate methods and models for representing and processing large unstructured data |
Skills - skills resulting from the course: |
effectively use methods and technologies for searching large unstructured data |
implement various web search methods and basic natural language processing methods |
Competences - competences resulting from the course: |
N/A |
By completing this course, the student gains not only the ability to implement various natural language processing methods but also professional knowledge of their use in software engineering, business intelligence, social media monitoring, fraud detection, detection of dangerous texts and opinions, sentiment analysis, etc. The student also gains the ability to employ formal methods for the construction of such software. |
|
Assessment methods
|
Knowledge - knowledge achieved by taking this course is verified by the following means: |
Individual presentation at a seminar |
Continuous assessment |
Test |
Combined exam |
Skills - skills achieved by taking this course are verified by the following means: |
Project |
Skills demonstration during practicum |
Combined exam |
Competences - competences achieved by taking this course are verified by the following means: |
Combined exam |
Individual presentation at a seminar |
|
Teaching methods
|
Knowledge - the following training methods are used to achieve the required knowledge: |
Practicum |
Lecture supplemented with a discussion |
Task-based study method |
Individual study |
Self-study of literature |
Multimedia supported teaching |
Skills - the following training methods are used to achieve the required skills: |
Skills demonstration |
Competences - the following training methods are used to achieve the required competences: |
Lecture supplemented with a discussion |
|
|
|
|