yangysc/Document-Classification

Document classification

Classify documents in Python using TF-IDF features and an SVM.

  • Two Python libraries (Pandas and liblinear) are required. On Windows, you can download the liblinear library from http://www.lfd.uci.edu/~gohlke/pythonlibs/#liblinear

  • The data files are structured as follows:

    • Each line of a .data file has the form "docIdx wordIdx count".
    • The .label files are simply a list of label IDs.
    • The .map files map label IDs to label names.
  • This demo reaches an accuracy of about 81.3991% (6109 of 7505 documents classified correctly).
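
The "docIdx wordIdx count" layout described above can be turned into TF-IDF features along the following lines. This is only a minimal sketch using the standard library, not the repository's own code (which relies on Pandas and liblinear). The function names `parse_data_lines` and `tfidf` are hypothetical, the fields are assumed to be whitespace-separated integers, and the weighting (length-normalized term frequency times log(N / df)) is an assumption, since the README does not say which TF-IDF variant the demo uses.

```python
import math
from collections import defaultdict


def parse_data_lines(lines):
    """Parse 'docIdx wordIdx count' lines into {doc_id: {word_id: count}}."""
    docs = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        doc_id, word_id, count = (int(p) for p in parts)
        docs[doc_id][word_id] = count
    return dict(docs)


def tfidf(docs):
    """Weight counts by tf * idf (assumed variant, see the note above).

    tf  = count / total word count of the document
    idf = log(number of documents / number of documents containing the word)
    """
    n_docs = len(docs)
    doc_freq = defaultdict(int)
    for counts in docs.values():
        for word_id in counts:
            doc_freq[word_id] += 1

    weighted = {}
    for doc_id, counts in docs.items():
        total = sum(counts.values())
        weighted[doc_id] = {
            word_id: (count / total) * math.log(n_docs / doc_freq[word_id])
            for word_id, count in counts.items()
        }
    return weighted


# Tiny made-up sample in the .data layout: two documents, three words.
sample = ["1 1 2", "1 2 2", "2 1 1", "2 3 3"]
docs = parse_data_lines(sample)
features = tfidf(docs)
```

With this weighting, a word that occurs in every document (word 1 in the sample) gets a weight of zero, while words confined to a single document keep a positive weight. The resulting sparse per-document dictionaries are the kind of input a linear SVM trainer would consume.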
