Welcome to MSMK7025: Algorithms, Big Data and Online Marketplaces!
Class A: Mon & Thu 2:00 - 5:00pm (CP-EFG) Class Days: Dec 5, 8, 12, 15, 19, Jan 5, 9, 12, 16, 19
Class B: Mon & Thu 6:30 - 9:30pm (CP-EFG) Class Days: Dec 5, 8, 12, 15, 19, Jan 5, 9, 12, 16, 19
Class C: Tue & Fri 6:30 - 9:30pm (CP-ABC) Class Days: Dec 6, 9, 13, 16, 20, Jan 3, 6 (Classroom: LTB), 10, 13, 17
Instructor: Xi Li
- You can download the software for the course here: R, RStudio and Tableau Public
- Note: When installing R/RStudio, make sure the installation path does not contain any non-English characters; otherwise errors will occur at run time.
- No textbooks for the course.
Grading:
10%: Group Paper Presentation (~12 minutes, in class). Details
Select your projects here: Class A, Class B, Class C
40%: Two Group Data Projects (no presentation)
Deadline for Project I: Class A (Dec 29, 5:00pm), Class B (Dec 29, 9:30pm), and Class C (Dec 30, 9:30pm)
Deadline for Project II: Class A (Jan 19, 5:00pm), Class B (Jan 19, 9:30pm), and Class C (Jan 17, 9:30pm)
20%: In-class Participation
30%: Individual Assignment (no presentation)
Deadline for Individual Assignment: Class A (Jan 19, 5:00pm), Class B (Jan 19, 9:30pm), and Class C (Jan 17, 9:30pm)
Lecture 1: Introduction
Slides (With Answer Keys)
Lecture 2: Introduction to R
Note: Please get R and Tableau Public installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: A beginner’s guide to R
Troubleshooting for package installation: Here
Additional tips for summarizing data (in Chinese): link. Thanks to Sheng for sharing it!
Lecture 3: Data Visualization with Tableau
Note: Please get R and Tableau Public installed on your laptop and bring it with you.
Slides
Data files: Superstore Data, Movie Data
Tableau Public Gallery
Example of a word cloud: Word Cloud
Tool for generating word clouds: Tagul
Templates for word clouds: Lion, Trump
Lecture 4: Beyond Linear Regression
Slides (With Answer Keys)
Lecture 5: Data Workshop I
Slides (With Answer Keys)
Dataset: CSV
Sample Data Analysis with R: Sample
Lecture 6: Data Scraping
Slides (With Answer Keys)
Teaching materials: Scraper (HKU), Scraper (Marketing Science Journal), Scraper (Harvard), Scraper (Photo)
Lecture 7: Data Workshop II
Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: Quadratic Regression, Interactions and Fixed Effects
Review Dataset: XLSX, CSV
Sample Data Analysis with R for online reviews: Sample
Property Dataset: Please access the online platform here
Sample code for data analysis
General Information about property valuation in HK
User Manual for the online platform
Variable List
Lecture 8: Text Analysis
Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Code: text analysis, topic models
A Chinese sentiment lexicon: here
Demonstration of sentiment analysis.
Demonstration of LDA. Data files: document and stopword
Lecture 9: Causality
Slides (With Answer Keys)
Optional reading: a discussion of some famous instruments (in Simplified Chinese): here
Optional reading: the 2021 Nobel Prize in Economics (Simplified Chinese version: here; English version: here)
Lecture 10: Recommender Systems
Slides (Without Answer Keys)