Bayesian and Logistic Regression Classifiers

Two classifiers are currently supported: Naive Bayes and logistic regression. The following examples use the BayesClassifier class, but the LogisticRegressionClassifier class can be substituted in its place.

var natural = require('natural');
var classifier = new natural.BayesClassifier();

You can train the classifier on sample text. It will use reasonable defaults to tokenize and stem the text.

classifier.addDocument('i am long qqqq', 'buy');
classifier.addDocument('buy the q\'s', 'buy');
classifier.addDocument('short gold', 'sell');
classifier.addDocument('sell gold', 'sell');

classifier.train();

Outputs “sell”

console.log(classifier.classify('i am short silver'));

Outputs “buy”

console.log(classifier.classify('i am long copper'));

You can also retrieve every matched class together with its associated value by calling getClassifications.

Outputs:

[ { label: 'buy', value: 0.39999999999999997 },
  { label: 'sell', value: 0.19999999999999998 } ]

From this:

console.log(classifier.getClassifications('i am long copper'));
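One way to act on these values is to accept the top label only when it clearly beats the runner-up. The sketch below is plain JavaScript that works on an array shaped like the getClassifications output above. The helper name pickLabel, the minRatio parameter and its default of 1.5 are hypothetical and not part of natural, and the sketch assumes the values are non-negative scores where larger means a better match.

```javascript
// Hypothetical helper, not part of natural: choose a label from an array
// shaped like the output of classifier.getClassifications(...).
function pickLabel(classifications, minRatio = 1.5) {
  if (classifications.length === 0) return null;
  // Sort a copy by descending value so the input array is left untouched.
  const sorted = [...classifications].sort((a, b) => b.value - a.value);
  const [best, second] = sorted;
  // With a single candidate there is nothing to compare against.
  if (!second) return best.label;
  // Accept the winner only if it beats the runner-up by the given ratio.
  return best.value >= second.value * minRatio ? best.label : null;
}

const scores = [
  { label: 'buy', value: 0.39999999999999997 },
  { label: 'sell', value: 0.19999999999999998 }
];
console.log(pickLabel(scores));      // buy
console.log(pickLabel(scores, 3));   // null
```

Returning null for a narrow margin lets the caller treat the input as unclassified instead of forcing a label.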

The classifier can also be trained on, and can classify, strings, arrays of tokens, or any mixture of the two. Arrays let you supply entirely custom data using your own tokenization and stemming, if you choose to implement them.

classifier.addDocument(['sell', 'gold'], 'sell');
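As an illustration of the array form, the sketch below builds token arrays with a small hand-written tokenizer. The function name toTokens and the stop-word list are hypothetical and chosen only for this example; the resulting arrays could then be passed to addDocument or classify in place of strings.

```javascript
// Hypothetical tokenizer for illustration; it is not part of natural.
const STOP_WORDS = new Set(['i', 'am', 'the', 'a', 'that']);

function toTokens(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)          // split on anything that is not a letter or digit
    .filter(function (token) {
      return token.length > 0 && !STOP_WORDS.has(token);
    });
}

console.log(toTokens('Sell the GOLD!'));   // [ 'sell', 'gold' ]

// The arrays can then be used in place of strings, for example:
// classifier.addDocument(toTokens('Sell the GOLD!'), 'sell');
```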

The training process can be monitored by subscribing to the trainedWithDocument event emitted by the classifier. This event is emitted each time the classifier finishes training against a document:

    classifier.events.on('trainedWithDocument', function (obj) {
       console.log(obj);
       /* {
       *   total: 23 // There are 23 total documents being trained against
       *   index: 12 // The index/number of the document that's just been trained against
       *   doc: {...} // The document that has just been indexed
       *  }
       */
    });
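A handler for this event could, for example, turn each payload into a progress line. The sketch below assumes the payload carries the total and index fields shown above and that index is zero-based. Both the zero-based assumption and the helper name formatProgress are assumptions of this sketch, so adjust the arithmetic if your version of natural counts from one.

```javascript
// Hypothetical helper: format a trainedWithDocument payload as a progress line.
// Assumes obj.index is zero-based; drop the "+ 1" if it turns out to be one-based.
function formatProgress(obj) {
  const done = obj.index + 1;
  const percent = Math.round((done / obj.total) * 100);
  return 'trained ' + done + '/' + obj.total + ' (' + percent + '%)';
}

console.log(formatProgress({ total: 23, index: 12 }));   // trained 13/23 (57%)

// Possible wiring, using the event shown above:
// classifier.events.on('trainedWithDocument', function (obj) {
//   console.log(formatProgress(obj));
// });
```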

A classifier can also be persisted and recalled so you can reuse its training later:

classifier.save('classifier.json', function(err, classifier) {
    // the classifier is saved to the classifier.json file!
});

To recall from the classifier.json saved above:

natural.BayesClassifier.load('classifier.json', null, function(err, classifier) {
    console.log(classifier.classify('long SUNW'));
    console.log(classifier.classify('short SUNW'));
});
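If you prefer promises to callbacks, the load call can be wrapped. This is a minimal sketch: loadClassifier is a hypothetical helper, and it assumes only the load(filename, stemmer, callback) shape used in the example above, passing null for the stemmer as that example does.

```javascript
// Hypothetical wrapper, not part of natural: expose a callback-style
// load(filename, stemmer, callback) as a Promise.
function loadClassifier(ClassifierClass, filename) {
  return new Promise(function (resolve, reject) {
    // null mirrors the stemmer argument used in the example above.
    ClassifierClass.load(filename, null, function (err, classifier) {
      if (err) {
        reject(err);
      } else {
        resolve(classifier);
      }
    });
  });
}

// Possible usage:
// loadClassifier(natural.BayesClassifier, 'classifier.json')
//   .then(function (classifier) {
//     console.log(classifier.classify('long SUNW'));
//   })
//   .catch(console.error);
```

Passing the classifier class as an argument keeps the helper usable with either BayesClassifier or LogisticRegressionClassifier.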

A classifier can also be serialized and deserialized like so:

var classifier = new natural.BayesClassifier();
classifier.addDocument(['sell', 'gold'], 'sell');
classifier.addDocument(['buy', 'silver'], 'buy');

// serialize
var raw = JSON.stringify(classifier);
// deserialize
var restoredClassifier = natural.BayesClassifier.restore(JSON.parse(raw));
console.log(restoredClassifier.classify('i should sell that'));

Note: when classifying text in languages other than English, you may need to pass in the stemmer to use. You can do this with any stemmer, including alternate English stemmers. The default is the PorterStemmer.

// Use the stemmer exported by the package rather than a path into node_modules.
const PorterStemmerRu = natural.PorterStemmerRu;
var classifier = new natural.BayesClassifier(PorterStemmerRu);