---
title: "Evaluate"
subtitle: "Evaluate performance of the predicted model with the test data"
---
Model evaluation uses the test data set aside during the earlier splitting to assess how well the model predicts the response of presence or absence. Since the test response is binary \[0,1\] and the model's prediction is continuous \[0-1\], a threshold must be applied to convert the continuous prediction to binary. This is often done with a Receiver Operating Characteristic (**ROC**) curve (@fig-rocr), which evaluates the **confusion matrix** (@tbl-confusion-matrix) at each candidate threshold.
|          |              | Predicted: 0 (absence) | Predicted: 1 (presence) |
|----------|--------------|------------------------|-------------------------|
| Observed | 0 (absence)  | True absence           | False presence          |
|          | 1 (presence) | False absence          | True presence           |

: Confusion matrix to understand predicted versus observed. {#tbl-confusion-matrix}
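As a minimal sketch of the thresholding step, the following R code converts continuous predictions to binary at one threshold and tabulates the confusion matrix. The object names are hypothetical placeholders: `test$present` stands for the observed 0/1 response in the set-aside test data, and `pred_prob` for the model's continuous predictions on that data.

```{r}
#| eval: false
# Apply a single threshold to convert continuous predictions [0-1]
# to binary [0,1]. `test$present` and `pred_prob` are hypothetical
# placeholder names for the observed response and model predictions.
threshold   <- 0.5
pred_binary <- as.integer(pred_prob >= threshold)

# Cross-tabulate observed vs predicted: the confusion matrix at this threshold
table(Observed = test$present, Predicted = pred_binary)
```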
![ROC curve showing the true positive rate versus the false positive rate as the threshold value changes (rainbow colors). Source: [ROCR: visualizing classifier performance in R](https://cran.rstudio.com/web/packages/ROCR/vignettes/ROCR.html)](figures/rocr.png){#fig-rocr}
From the ROC curve, the area under the curve (**AUC**) is calculated, which is a measure of the model's ability to distinguish between presence and absence. AUC values range from 0 to 1, with 0.5 being no better than random, and 1 being perfect.
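Using the ROCR package cited in @fig-rocr, a sketch of computing the ROC curve and AUC from the same hypothetical objects as above might look like this:

```{r}
#| eval: false
library(ROCR)

# Pair the continuous predictions with the observed 0/1 labels
pred <- prediction(pred_prob, test$present)

# True positive rate vs false positive rate across all thresholds,
# colorized by threshold value as in @fig-rocr
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf, colorize = TRUE)
abline(a = 0, b = 1, lty = 2)  # diagonal = no better than random (AUC 0.5)

# Area under the ROC curve
performance(pred, measure = "auc")@y.values[[1]]
```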
## More Resources
* [Classification: ROC Curve and AUC | Machine Learning | Google for Developers](https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc)