NESVM: a Fast Gradient Method for Support Vector Machines

This is a work we finished one year ago and is accepted as a regular paper in ICDM 2010. NESVM is a fast SVM algorithm for primal linear and kernel SVM problems, with the optimal convergence rate of first-order method and simple computation in each iteration. The PDF of the paper can be downloaded here:

NESVM: a Fast Gradient Method for Support Vector Machines

You can also find it on arXiv. SVM is a fundamental classification tool for various machine learning tasks. However, its efficiency is not satisfying for large-scale classification problems. There exist several appealing SVM solvers, e.g., LIBSVM, SVM-Light, SVM-Perf, Pegasos. Most of them focus on solving dual SVM or solving primal SVM in an online way.

Compared with existing SVM solvers, NESVM is competitive in three aspects:

  • NESVM is a fast gradient method with the optimal convergence rate at \mathcal O(1/k^2) and linear time complexity. It is based on Nesterov’s method proposed in paper Smooth minimization of non-smooth functions. Only two matrix-vector multiplications are required in each iteration round.
  • NESVM directly tackles the original primal problem of batch SVM and its implementation is fairly simple, thus it can be easily extended to and embedded in many other complex classification tasks.
  • NESVM proposes novel smooth and differentiable alternatives for hinge loss and \ell_1 norm, which are commonly used in machine learning and difficult to optimize because they are not differentiable everywhere.

We compared the performance of NESVM with other popular SVM solvers on several datasets, here is an example of event recognition in computer vision:

Time costs vs. constant C in event recognition

I will post the code of NESVM in a few days.


About tianyizhou

Research Assistant at University Washington, Seattle, Working on Machine Learning & Statistics in MODE lab leaded by Prof. Carlos Guestrin, and MELODI lab leaded by Prof. Jeff Bilmes.
This entry was posted in Tianyi's work and tagged , , , , . Bookmark the permalink.

2 Responses to NESVM: a Fast Gradient Method for Support Vector Machines

  1. GR says:

    Is this limited to the C-SVM formulation? Do you have any plans to offer a nu-SVM alternative, as in e.g. LibSVM?

    • tianyizhou says:

      I think NESVM is able to be conveniently extended to most versions of SVMs, if they have their own primal optimization forms. You can find the extensions to LS-SVM and LP-SVM in Section 3 of the NESVM paper.

      For nu-SVM (primal form), its difference comparing with C-SVM are that an extra new variable $\rho$ and a new parameter $\nu$ are added in, while $C$ is replaced by $1/m$. What we have to do is to replace the $1$ in hinge loss of C-SVM with $\rho$, replace this modified hinge loss with the slack variables $\xi$ in the object function, then use the method of NESVM to solve the optimization problem. The new variable $\rho$ and $w$ can be written as a single vector variable in the optimization.

      I believe the nu-SVM will be included in our future work of NESVM, especially in the first MatLab version, which will be published after ICDM. Thank you very much for your advice!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s