Welcome to MSMK7025: Algorithms, Big Data and Online Marketplaces!

Class A: Mon & Thu 2:00 - 5:00pm (CP-EFG) Class Days: Dec 5, 8, 12, 15, 19, Jan 5, 9, 12, 16, 19
Class B: Mon & Thu 6:30 - 9:30pm (CP-EFG) Class Days: Dec 5, 8, 12, 15, 19, Jan 5, 9, 12, 16, 19
Class C: Tue & Fri 6:30 - 9:30pm (CP-ABC) Class Days: Dec 6, 9, 13, 16, 20, Jan 3, 6 (Classroom: LTB), 10, 13, 17

Instructor: Xi Li

  • You can download the software for the course here: R, RStudio and Tableau Public
  • Note: When installing R/RStudio, make sure the installation path contains only English (ASCII) characters; otherwise errors will occur when running the software.
  • No textbooks for the course.

Grading:
10%: Group Paper Presentation (~12 minutes, in class). Details
Select your projects here: Class A, Class B, Class C
40%: Two Group Data Projects (no presentation)
Deadline for Project I: Class A (Dec 29, 5:00pm), Class B (Dec 29, 9:30pm), and Class C (Dec 30, 9:30pm)
Deadline for Project II: Class A (Jan 19, 5:00pm), Class B (Jan 19, 9:30pm), and Class C (Jan 17, 9:30pm)
20%: In-class Participation
30%: Individual Assignment (no presentation)
Deadline for Individual Assignment: Class A (Jan 19, 5:00pm), Class B (Jan 19, 9:30pm), and Class C (Jan 17, 9:30pm)

QR code for live comments

Lecture 1: Introduction

Slides (With Answer Keys)

Lecture 2: Introduction to R

Note: Please get R and Tableau Public installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: A beginner’s guide to R
Troubleshooting for package installation: Here
Additional tips for summarizing data (in Chinese): link. Thanks to Sheng for sharing it!

Lecture 3: Data Visualization with Tableau

Note: Please get R and Tableau Public installed on your laptop and bring it with you.
Slides
Data files: Superstore Data, Movie Data
Tableau Public Gallery
Example of a word cloud: Word Cloud
Tool for generating word clouds: Tagul
Templates for word clouds: Lion, Trump

Lecture 4: Beyond Linear Regression

Slides (With Answer Keys)

Lecture 5: Data Workshop I

Slides (With Answer Keys)
Dataset: CSV
Sample Data Analysis with R: Sample

Lecture 6: Data Scraping

Slides (With Answer Keys)
Teaching materials: Scraper (HKU), Scraper (Marketing Science Journal), Scraper (Harvard), Scraper (Photo)

Lecture 7: Data Workshop II

Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: Quadratic Regression, Interactions and Fixed Effects

Review Dataset: XLSX, CSV
Sample Data Analysis with R for online reviews: Sample

Property Dataset: Please access the online platform here
Sample code for data analysis
General Information about property valuation in HK
User Manual for the online platform
Variable List

Lecture 8: Text Analysis

Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Code: text analysis, topic models
A Chinese sentiment lexicon: here
Demonstration of sentiment analysis.
Demonstration of LDA. Data files: document and stopword

Lecture 9: Causality

Slides (With Answer Keys)
Optional reading: a discussion of some famous instruments (in Simplified Chinese): here
Optional reading: the 2021 Nobel Prize in Economics (Simplified Chinese version: here; English version: here)

Lecture 10: Recommender Systems

Slides (Without Answer Keys)