Welcome to MSMK7025: Algorithms, Big Data and Online Marketplaces!
Class A: Mon & Thu 09:30 - 12:30 (B4) Class Days: Dec 4, 7, 11, 14, 18, 21, Jan 4, 8, 11, 15
Class B: Mon & Thu 14:00 - 17:00 (B4) Class Days: Dec 4, 7, 11, 14, 18, 21, Jan 4, 8, 11, 15
Class C: Wed 09:30 - 12:30 and 14:00 - 17:00 (B4) Class Days: Dec 6, 13, 20, Jan 10, 17
Instructor: Xi Li
- You can download the software for the course here: R, RStudio and Tableau Public
- Note: When installing R/RStudio, make sure the installation path does not contain any non-English characters. (The installation path must be entirely in English; otherwise R will fail to run.)
- If you have issues with the laptop version of R/RStudio, you can try the cloud platform here.
- There is no textbook for the course.
- Order of presentations: Class A, Class B, Class C
- Sample presentation slides
Lecture 1: Introduction
Slides (With Answer Keys)
Lecture 2: Introduction to R
Note: Please get R and Tableau Public installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: A beginner’s guide to R
Troubleshooting for package installation: Here
Additional tips for summarizing data (in Chinese): link. Thanks to our former student Sheng for sharing it!
Lecture 3: Data Visualization with Tableau
Note: Please get R and Tableau Public installed on your laptop and bring it with you.
Slides
Data files: Superstore Data, Movie Data
Tableau Public Gallery
Example of a word cloud: Word Cloud
Tool for generating word clouds: WordArt
Templates for word clouds: Lion, Trump
Lecture 4: Beyond Linear Regression
Slides (With Answer Keys)
Lecture 5: Data Workshop
Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Dataset: CSV
Sample Data Analysis with R: Sample
Lecture 6: Introduction to Data Scraping
Note: Please get R and the Chrome browser installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: Scraper (HKU)
Scraper (Photo)
Lecture 7: Text Analysis
Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Code: word frequency and count, text analysis, topic models
A Chinese sentiment lexicon: here
Demonstration of LDA. Data files: document and stopword
Lecture 8: Data Workshop II
Note: Please get R installed on your laptop and bring it with you.
Slides (With Answer Keys)
Teaching materials: Quadratic Regression, Interactions and Fixed Effects
Review Dataset: XLSX, CSV
Sample Data Analysis with R for online reviews: Sample
Property Dataset: Please access the online platform here
Note: If you see an error message, please try another browser.
Sample code for data analysis
General Information about property valuation in HK
User Manual for the online platform
Variable List
Lecture 9: Causality
Slides (With Answer Keys)
Lecture 10: Personalization and Recommendation
Slides (With Answer Keys)
Code: Collaborative Filtering