Performance evaluation

Frequently used techniques for performance evaluation

Metrics

Gini

  • Definition: Indicates how discriminative the model is, i.e. its predictive power (see the sketch after this list).

  • Possible values:

    • 0 would indicate the model does not use the features to discriminate between classes at all.

    • 1 would indicate the model relies entirely on the features to discriminate between classes (desirable for banks, for example).

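A minimal sketch of one common way to compute the Gini coefficient for a binary classifier, assuming scikit-learn is available and using the relation Gini = 2 * AUC - 1; the labels and scores below are made-up illustration data.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical true labels and model scores for a binary classifier
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.3, 0.8, 0.65, 0.2, 0.9, 0.55, 0.4]

# Gini coefficient derived from the ROC AUC: Gini = 2 * AUC - 1
auc = roc_auc_score(y_true, y_score)
gini = 2 * auc - 1
print(f"AUC = {auc:.3f}, Gini = {gini:.3f}")
```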
KS statistic

  • Definition: Indicates the maximum distance between the empirical distribution functions of two samples (e.g. the score distributions of the two classes in supervised learning), or between one sample and a reference distribution (see the sketch after this list).

  • Possible values:

    • 0 would indicate no distinction between the two samples (no difference between the label A and label B score distributions).

    • 1 would indicate the maximum possible distance between the two samples.

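A small sketch, assuming SciPy is available, of computing the KS statistic between the score distributions of two classes; the scores are illustrative placeholders.

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical model scores split by true class
scores_class_a = np.array([0.1, 0.2, 0.25, 0.3, 0.4])
scores_class_b = np.array([0.55, 0.6, 0.7, 0.8, 0.9])

# KS statistic: maximum distance between the two empirical CDFs
statistic, p_value = ks_2samp(scores_class_a, scores_class_b)
print(f"KS = {statistic:.3f}, p-value = {p_value:.4f}")
```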
Student's t-test

  • Definition: Indicates how likely it is that two sets of samples came from the same distribution (the p-value). The p-value can be compared against a statistical significance threshold (e.g. 0.05).

  • Possible values:

    • A p-value < 0.05 indicates we can reject the null hypothesis that the two samples come from the same distribution.

  • Comment: The samples are assumed to be approximately normally distributed. See the sketch after this list.

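A small sketch, assuming SciPy, of a two-sample t-test comparing two sets of hypothetical model scores from repeated runs.

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical performance scores of two models across repeated runs
model_a = np.array([0.81, 0.79, 0.83, 0.80, 0.82])
model_b = np.array([0.76, 0.78, 0.75, 0.77, 0.74])

# Two-sample t-test (assumes approximately normal samples)
statistic, p_value = ttest_ind(model_a, model_b)
if p_value < 0.05:
    print(f"p = {p_value:.4f}: reject the null hypothesis (distributions differ)")
else:
    print(f"p = {p_value:.4f}: cannot reject the null hypothesis")
```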
K-fold cross validation

  • Definition: Indicates the variation in performance obtained by splitting the data into K equally sized folds, evaluating on each fold in turn while training on the remaining folds.

  • Possible values:

    • K is typically between 5 and 10 folds.

  • Comment: A stratified version is recommended, since it keeps the same class proportions as the whole dataset in each fold (see the sketch after this list).

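A minimal sketch of stratified K-fold cross validation, assuming scikit-learn; the dataset and model (Iris, logistic regression) are just placeholders.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Stratified K-fold keeps the class proportions of the whole dataset in each fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

print("Per-fold accuracy:", scores.round(3))
print(f"Mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```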
Cross entropy vs sparse cross entropy

If your Y_i's are one-hot encoded, use categorical_crossentropy. Examples (for a 3-class classification): [1,0,0], [0,1,0], [0,0,1]

But if your Y_i's are integer class indices, use sparse_categorical_crossentropy. Examples for the same 3-class classification problem: [0], [1], [2]. This can be more memory efficient.

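A small sketch, assuming TensorFlow/Keras, showing that the two losses agree when given the same targets in one-hot versus integer form; the predictions are made-up.

```python
import numpy as np
import tensorflow as tf

# The same 3-class targets expressed in the two formats
y_one_hot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype="float32")
y_int = np.array([0, 1, 2])  # integer class indices, 0-indexed
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6]], dtype="float32")

# categorical_crossentropy takes one-hot targets,
# sparse_categorical_crossentropy takes integer targets; the losses match
cce = tf.keras.losses.categorical_crossentropy(y_one_hot, y_pred)
scce = tf.keras.losses.sparse_categorical_crossentropy(y_int, y_pred)
print(cce.numpy().round(4), scce.numpy().round(4))
```
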
References:

  • https://www.statology.org/k-fold-cross-validation/