CS 67 W — An Introduction to Natural Language Processing (NLP) with Python

Quarter: Winter
Course Format: Flex Online
Duration: 8 weeks
Date(s): Jan 25–Mar 19
Drop Deadline: Jan 28
Unit: 1
Tuition: $545
Instructor(s): Oleg Melnikov
Limit: 35
Status: Registration opens Nov 30, 8:30 am (PT)
Please Note: Some of our refund deadlines have changed. See this course's drop deadline above and consult the full refund policy.
Most technology companies that track some level of human interaction collect and process vast amounts of textual data. How do such enterprises organize and make sense of these enormous troves of words and expressions? The field of machine learning that addresses this problem is natural language processing (NLP). NLP professionals are eagerly sought after in today’s tech job market. They help companies process large textual data sets quickly and summarize, visualize, and digitize these data so as to reduce noise, highlight signals, and reveal insights. This course offers a comprehensive introduction to NLP terminology, as well as an introduction to Python and other necessary tools of the NLP professional. Along the way, we will cover text pre-processing, feature extraction, text classification, summarization, document clustering, sentiment analysis, and word vector representation. Students will develop the intuition and skills to choose the right NLP tool for the problem at hand, along with the corresponding evaluation metrics for gauging their results and communicating them to technical and nontechnical audiences. In weekly assignments, students will apply NLP concepts in hands-on activities using Python, relevant libraries, and the Jupyter Notebook. Finally, each student will prepare and deliver a short presentation, designed as practice for real-world job interviews, that will be evaluated and critiqued by the instructor and the class.

Students are expected to have intermediate Python programming skills and a basic understanding of college-level math and matrix algebra. Those with stronger preparation and some familiarity with natural language processing will be able to sharpen and deepen their expertise in this course. Before enrolling, students are encouraged to complete a self-evaluation of preparedness, which is available through the preliminary syllabus. Note: Students with a lower preparedness score should expect to need more study time.

Oleg Melnikov, Lecturer, School of Information, UC Berkeley; Senior Director of Data Science, ShareThis

Oleg Melnikov has developed and taught numerous courses in statistics, machine learning, natural language processing, deep learning, Python, R, web development with SQL/PHP, deep neural networks, and quantitative finance at Stanford, UC Berkeley, Cornell, University of Chicago, University of Washington, Johns Hopkins, Higher School of Economics, Rice, UC Irvine, and others. He received a PhD in statistics and MS degrees in computer science, mathematics, statistics, business, and finance.

Textbooks for this course:

(Required) Dipanjan Sarkar, Text Analytics with Python: A Practitioner’s Guide to Natural Language Processing, 2nd ed. (ISBN 978-1484243534)