Machine Learning (2019, Spring)

# Homework 6 - Malicious Comments Identification

### Announcements

#### 5/3
* train_x.csv has duplicate sentences from id 119018-119999

#### 5/2
* HW6 sample code release!
* Strong baseline release!

#### 4/25
* HW6 release!
* HW6 raw data is available on CEIBA!

<hr>

### Links

* <a href="https://docs.google.com/presentation/d/1KJAYYM-n_DqOSWEwrPh52yVO1rzkJDBc8WgjLVWpB_E/edit?usp=sharing" target="_blank">Assignment slides</a>
* <a href="https://www.kaggle.com/c/ml2019spring-hw6/" target="_blank">Kaggle link</a>
* <a href="https://docs.google.com/document/d/1bWaZfjG8g8fPe2VxzuWV8IuMRcmGztPUD7fR19iB7Cs/edit?usp=sharing" target="_blank">Report template</a>
* <a href="https://docs.google.com/forms/d/e/1FAIpQLSdsspthuruApgcizvnoUW2OdBCEhJ24_LLYjXAw1gpYyQMggw/viewform" target="_blank">Late submission form</a>
* <a href="https://hackmd.io/s/S1YZNj4iE" target="_blank">Sample code</a>
* <a href="https://www.facebook.com/groups/314613059175222/permalink/351075982195596/" target="_blank">Facebook Discussion</a>

<hr>

### Deadlines

* Simple Bonus Deadline: 05/02/2019 11:59:59 (GMT+8)
* Kaggle Deadline: 05/09/2019 11:59:59 (GMT+8)
* Github Deadline: 05/10/2019 23:59:59 (GMT+8)

<hr>

### Assignment Regulation

* All code must be written in Python 3.6
* For `hw6_train.sh` and `hw6_test.sh`, only the following are permitted:
    * The entire Python standard library (e.g. sys, csv, time)
    * numpy >= 1.14
    * pandas >= 0.24.1
    * PyTorch 1.0.1, TensorFlow 1.12.0, Keras == 2.2.4
    * jieba 0.39
    * gensim 3.7.1 (only the word2vec API may be used!)
    * emoji 0.5.1

<hr>

### FAQ

Q1: What is the team size limit on Kaggle?

A1: HW6 is an individual assignment, so there is no need to form a team on Kaggle.

Q2: What are the reproduction rules?

A2:

1. The reproduction time limit is 10 minutes, not including download time.
2. The allowed reproduction error is ±0.2%.
3. Reproducing either one of the two submissions selected on Kaggle is sufficient.
4. Baseline scores are taken from Kaggle.

<hr>
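
The 5/3 announcement notes that train_x.csv contains duplicate sentences for ids 119018-119999. One possible way to handle this is to skip those rows when loading the data. The following is only a minimal sketch using the standard library: the column names `id` and `comment`, the helper name `drop_duplicate_range`, and the tiny in-memory CSV are all assumptions for illustration, not the actual file layout.

```python
import csv
import io

# Id range reported as duplicated in the 5/3 announcement (inclusive).
DUP_FIRST_ID = 119018
DUP_LAST_ID = 119999


def drop_duplicate_range(rows, id_col="id"):
    """Return the rows whose id lies outside the duplicated range.

    `rows` is any iterable of dicts (e.g. from csv.DictReader).
    The column name `id` is an assumption about the CSV header.
    """
    kept = []
    for row in rows:
        row_id = int(row[id_col])
        if DUP_FIRST_ID <= row_id <= DUP_LAST_ID:
            continue  # skip rows inside the duplicated range
        kept.append(row)
    return kept


# Hypothetical stand-in for train_x.csv, just to exercise the helper.
demo_csv = io.StringIO(
    "id,comment\n"
    "119016,first sentence\n"
    "119017,second sentence\n"
    "119018,first sentence\n"
    "119999,second sentence\n"
)
cleaned = drop_duplicate_range(csv.DictReader(demo_csv))
kept_ids = [int(row["id"]) for row in cleaned]
```

If rows are dropped from the training sentences this way, the matching ids would presumably also need to be dropped from the label file so that inputs and labels stay aligned.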