Is betting just guessing (do odds reflect the true probability)?

This competition is for those who entered the Data Science Melbourne 2015 Datathon.

You will already have the data for all games up to the semi-finals and finals. The task is to use this historical data to rank the punters by their profit over the final 3 games of the tournament (which is why we didn't give you this data).

We provide the list of Account_IDs to make predictions for, along with some limited features for the final 3 games that you may make use of.

The objective is to determine if betting is just guessing, or if past performance can be indicative of future performance. We expect this to be very hard, and will be impressed if anyone can come up with an algorithm that is better than a random number generator!

Evaluation

We are treating this as a binary classification problem - did the account make a profit or not. The evaluation metric is the AUC.

An AUC of 0.5 corresponds to random guessing and 1 to a perfect solution.
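To make the metric concrete, here is a minimal sketch of AUC computed as the probability that a randomly chosen profitable account is scored above a randomly chosen unprofitable one, with ties counting half. The function name `auc` and the toy labels and scores are illustrative assumptions, not part of the competition code.

```python
def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = account made a profit, 0 = it did not.
labels = [1, 0, 1, 0]
perfect = auc(labels, [0.9, 0.1, 0.8, 0.2])   # every positive outranks every negative
constant = auc(labels, [0.5, 0.5, 0.5, 0.5])  # no ranking information at all
```

The pairwise loop is quadratic, which is fine for illustration; a rank-based formulation would be the usual choice on the full account list.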

1st submission - 0.60778 (+0.00, +0.00%)

basic features
random forest
weighted profit formula

2nd submission - 0.62711 (+0.01933, +3.18%)

new features (BL ratio, cancel ratio etc.)
average profit formula
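As a rough illustration of the ratio features, the sketch below computes a per-account BL ratio and cancel ratio. It assumes that "BL" means back bets versus lay bets and that each bet row carries a type and a status; the field names `account_id`, `bid_type` and `status`, the codes `B`, `L`, `C`, and the +1 smoothing are all hypothetical choices, not taken from the actual data dictionary.

```python
from collections import defaultdict


def account_ratios(bets):
    """Aggregate bet rows into per-account ratio features."""
    counts = defaultdict(lambda: {"B": 0, "L": 0, "cancelled": 0, "total": 0})
    for bet in bets:
        c = counts[bet["account_id"]]
        c["total"] += 1
        if bet["bid_type"] in ("B", "L"):
            c[bet["bid_type"]] += 1
        if bet["status"] == "C":  # assumed code for a cancelled bet
            c["cancelled"] += 1

    features = {}
    for account_id, c in counts.items():
        features[account_id] = {
            # +1 smoothing avoids division by zero for accounts with no lay bets
            "bl_ratio": (c["B"] + 1) / (c["L"] + 1),
            "cancel_ratio": c["cancelled"] / c["total"],
        }
    return features


# Hypothetical rows for two accounts.
bets = [
    {"account_id": 1, "bid_type": "B", "status": "S"},
    {"account_id": 1, "bid_type": "B", "status": "C"},
    {"account_id": 1, "bid_type": "L", "status": "S"},
    {"account_id": 2, "bid_type": "L", "status": "S"},
]
features = account_ratios(bets)
```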

3rd submission - 0.63243 (+0.00532, +0.85%)

new features (difference between L and B)
xgboost

4th submission - 0.64118 (+0.00875, +1.384%)

new feature (invest amount)
blended models
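The notes do not say how the models were blended. One common option when the metric is AUC is to average ranks rather than raw scores, since AUC depends only on ordering and models may output scores on different scales. The sketch below shows that approach; `to_ranks`, `blend` and the toy predictions are illustrative assumptions.

```python
def to_ranks(scores):
    """Map scores to average ranks scaled to [0, 1]; tied scores share a rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    denom = max(len(scores) - 1, 1)
    return [r / denom for r in ranks]


def blend(predictions, weights=None):
    """Weighted average of rank-normalised predictions from several models."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    ranked = [to_ranks(p) for p in predictions]
    n = len(predictions[0])
    return [sum(w * r[i] for w, r in zip(weights, ranked)) / total for i in range(n)]


# Two hypothetical models scoring the same three accounts on different scales.
rf_pred = [0.2, 0.8, 0.5]
xgb_pred = [10.0, 30.0, 20.0]
blended = blend([rf_pred, xgb_pred])
```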

5th submission - 0.62621/0.64088

X New benchmark (past history by game)
X Log transformation
X K-means (transactional features & customized imputation)
X Feature selection
X Multi-rounds
X New Calculation
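For the log transformation item, profit can be negative, so a plain logarithm does not apply. A minimal sketch of one workaround, a signed `log1p`, is shown below; whether this exact variant was used here is an assumption, and `signed_log1p` is a hypothetical helper name.

```python
import math


def signed_log1p(x):
    """Compress magnitude with log1p while keeping the sign of x."""
    return math.copysign(math.log1p(abs(x)), x)


# Heavy-tailed toy profits, both losses and gains.
profits = [-1000.0, -10.0, 0.0, 10.0, 1000.0]
transformed = [signed_log1p(p) for p in profits]
```

The transform is monotonic, so it does not change rankings by itself; it mainly helps models that are sensitive to feature scale.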

6th submission - 0.63708

X Event Counts / Bag of Event
O Subset modeling
O Invest weighted calculation

7th submission - 0.64421

X Meta features
- xgboost (gbm, rf)
- h2o (gbm, rf, nb, glm, dl)
- sofia (svm, glm)
- tsne cluster
- k means cluster
- fm
- knn
O New customers 0.43/-5

8th submission - 0.63971

New Feature
Meta bagged modeling
O Separate models (new/existing customers)

9th submission - 0.

Past value (cumsum)
Factorization Machines (http://www.csie.ntu.edu.tw/~r01922136/libffm/)
Regression + Classification
python lasagne
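A possible reading of the "Past value (cumsum)" idea is a running total of each account's profit over earlier games, used as a feature for the next game. The sketch below assumes one row per account and game, a sortable `game_order`, and that the current game's profit is excluded to avoid leaking the target; all names are hypothetical.

```python
from collections import defaultdict


def past_cumulative_profit(rows):
    """rows: (account_id, game_order, profit) tuples.

    Returns {(account_id, game_order): profit earned strictly before that game}.
    """
    running = defaultdict(float)
    out = {}
    for account_id, game_order, profit in sorted(rows, key=lambda r: (r[0], r[1])):
        out[(account_id, game_order)] = running[account_id]  # excludes current game
        running[account_id] += profit
    return out


# Hypothetical per-game profits for two accounts.
rows = [(1, 1, 50.0), (1, 2, -20.0), (1, 3, 5.0), (2, 1, -10.0), (2, 3, 40.0)]
past = past_cumulative_profit(rows)
```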

Ref

https://github.com/Gzsiceberg/kaggle-avito
entropy based features
Bad features: win_hist / DL metafeatures
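One way to build an entropy based feature is to measure how evenly an account spreads its bets across categories such as matches or bet types. The sketch below computes Shannon entropy in bits over a list of category labels; the choice of category and the labels `match_1` to `match_4` are assumptions for illustration.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# An account betting on a single match versus one spreading bets evenly.
single = shannon_entropy(["match_1"] * 4)
spread = shannon_entropy(["match_1", "match_2", "match_3", "match_4"])
```

Low entropy indicates a concentrated bettor and high entropy a diversified one.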


ivanliu1989/Melbourne_Datathon_2015_Kaggle

Folders and files

Latest commit

History

Repository files navigation

Is betting just guessing (do odds reflect the true probability)?

Evaluation

1st submission - 0.60778 (+0.00, +0.00%)

2nd submission - 0.62711 (+0.01933, +3.18%)

3rd submission - 0.63243 (+0.00532, +0.85%)

4th submission - 0.64118 (+0.00875, +1.384%)

5th submission - 0.62621/0.64088

6th submission - 0.63708

7th submission - 0.64421

8th submission - 0.63971

9th submission - 0.

Ref

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages