Probabilistic Outputs for Support Vector Machines

This chapter covers results from the paper "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods", available on CiteSeerX (1999 version).

Lecture

The lecture slides are available here.

Overview of Platt Scaling

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The standard SVM output produces probabilities. This is

- [ ] True.
- [x] False.
  > Correct.
- [ ] Unknown.

### Platt Scaling

1. [x] trains an SVM and then trains the parameters of a sigmoid that maps the SVM outputs into probabilities.
   > Correct.
1. [ ] trains an SVM and the parameters of a sigmoid that maps the SVM outputs into probabilities simultaneously.
1. [ ] trains only an SVM.

Introduction

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### Standard Support Vector Machines (SVMs) produce

- [ ] a classification together with a probability.
- [x] an uncalibrated value that is not a probability.
  > Correct.
- [ ] only a probability estimate.

Related Work

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The posterior estimate derived from the two-Gaussian approximation is

- [ ] always monotonic.
- [x] non-monotonic.
  > Correct.
- [ ] zero.
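
To see why a two-Gaussian approximation can give a non-monotonic posterior, here is a minimal sketch. It assumes one Gaussian is fitted to the SVM outputs of each class and that the two Gaussians have different variances. The function names and the means, standard deviations, and prior are illustrative assumptions, not values from the paper.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def two_gaussian_posterior(f, mean_pos, std_pos, mean_neg, std_neg, prior_pos=0.5):
    """Bayes' rule with one Gaussian class-conditional density per class."""
    p_pos = gaussian_pdf(f, mean_pos, std_pos) * prior_pos
    p_neg = gaussian_pdf(f, mean_neg, std_neg) * (1.0 - prior_pos)
    return p_pos / (p_pos + p_neg)

# Illustrative (made-up) parameters: the positive class has a much wider
# spread of SVM outputs than the negative class.
params = dict(mean_pos=1.0, std_pos=2.0, mean_neg=-1.0, std_neg=0.5)

for f in (-6.0, -1.0, 3.0):
    print(f, two_gaussian_posterior(f, **params))
```

With these parameters the posterior is high far to the left, dips near the negative-class mean, and rises again to the right, so it is not a monotonic function of the SVM output. A single sigmoid, by contrast, is monotonic by construction.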

Approach

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### Platt Scaling uses training data to

- [ ] fit the single parameter T of the sigmoid model.
- [x] fit parameters A and B of the sigmoid model.
  > Correct.
- [ ] fit n parameters S1, S2, ..., Sn to a Gaussian model.
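
The fitting of A and B can be sketched as follows. The model is P(y=1 | f) = 1 / (1 + exp(A f + B)), where f is the SVM output, and the parameters are chosen to minimize the cross-entropy against the regularized targets described in the paper. This is a minimal sketch under stated assumptions: it uses plain gradient descent rather than the optimizer used in the paper, the function names are hypothetical, and the toy scores and labels are made up. In practice the scores would ideally come from held-out data or cross-validation, not from the examples the SVM was trained on.

```python
import math

def sigmoid_prob(f, A, B):
    """Platt's model: P(y=1 | f) = 1 / (1 + exp(A*f + B))."""
    z = A * f + B
    # Evaluate in a numerically stable way for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit A and B by gradient descent on the cross-entropy (a simplification)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Regularized targets instead of hard 0/1 labels.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]

    A, B = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_A = grad_B = 0.0
        for f, t in zip(scores, targets):
            d = t - sigmoid_prob(f, A, B)  # dLoss/d(A*f + B) for one example
            grad_A += d * f
            grad_B += d
        A -= lr * grad_A / n
        B -= lr * grad_B / n
    return A, B

# Made-up SVM outputs and labels (+1 / -1), for illustration only.
scores = [-2.1, -1.3, -0.8, -0.2, 0.3, 0.4, 0.9, 1.5, 2.2, 2.8]
labels = [-1, -1, -1, 1, -1, 1, 1, 1, 1, 1]

A, B = fit_platt(scores, labels)
probs = [sigmoid_prob(f, A, B) for f in scores]
print("A =", A, "B =", B)
```

Because larger SVM outputs should mean a higher probability of the positive class, the fitted A is expected to be negative under this parameterization.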

Results and Conclusions

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The proposed Platt scaling or SVM + sigmoid approach

- [ ] does not preserve the sparseness of kernels.
- [x] preserves the sparseness of kernels.
  > Correct.
- [ ] cannot be applied to kernel-based methods.
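
One way to read the sparseness claim: the sigmoid is applied on top of the usual SVM decision value, which is a kernel expansion over the support vectors only, so producing a probability needs no extra kernel evaluations. The sketch below illustrates this under assumptions: the support vectors, coefficients, bias, RBF kernel, and the sigmoid parameters A and B are all hypothetical values, not results from the paper.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Hypothetical trained SVM: only the support vectors are kept.
support_vectors = [(1.0, 1.0), (-1.0, -0.5), (0.5, -1.0)]
dual_coefs = [0.8, -0.5, -0.3]  # alpha_i * y_i for each support vector (made up)
bias = 0.1

# Hypothetical sigmoid parameters, as if already fitted in a second step.
A, B = -2.0, 0.0

def decision_value(x):
    """Uncalibrated SVM output: one kernel evaluation per support vector."""
    return sum(c * rbf_kernel(sv, x) for c, sv in zip(dual_coefs, support_vectors)) + bias

def probability(x):
    """Map the SVM output to P(y=1 | x) with the sigmoid 1 / (1 + exp(A*f + B))."""
    return 1.0 / (1.0 + math.exp(A * decision_value(x) + B))

print(probability((1.0, 1.0)), probability((-1.0, -0.5)))
```

The cost of `probability` is the cost of `decision_value` plus one exponential, regardless of how many training points were not support vectors.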

Code and Assignment

There are no programming assignments for this lecture.