CSE 60556: Large Language Models

Description

This is a graduate-level elective course aimed at graduate students who are interested in using and/or developing large language model (LLM) techniques. It is designed for those who have knowledge of, and programming experience in, machine learning. This course introduces tasks and datasets related to language models (LMs), LLM architectures, LLM training techniques, reasoning methods, knowledge augmentation methods, efficient LLM methods, various LLM applications (e.g., assistants, education, healthcare, RecSys, planning), and challenges in LLMs for social good. Specifically, we discuss popular LLM concepts such as Scaling Laws, GPT, RLHF, ICL, IFT, CoT, RAG, PEFT, Agents, Hallucination, and Trustworthiness, covering each topic in depth. Students will be expected to routinely read and present research papers and to complete a research project at the end of the course. In the project, students attempt to reimplement and improve upon a research paper on a topic of their choosing.

Instructor

Prerequisites

Credit received, with graduate student status, in at least one of the following courses: CSE 60625 Machine Learning, CSE 60647 Data Science, CSE 60657 Natural Language Processing, CSE 60868 Neural Networks; or a closely related graduate-level course at another accredited university.

Course Topics