This lesson is still being designed and assembled (Pre-Alpha version)

Stemming

Overview

Time: min
Objectives

What is Stemming?

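Stemming is the process of reducing tokens to a root form, called a stem, by stripping suffixes. For example, "studying" and "studied" are both reduced to a single stem close to "study". The stem need not be a dictionary word: NLTK's Porter stemmer returns "studi" for both. Two commonly used stemmers in Python, both provided by NLTK, are the Porter stemmer and the Lancaster stemmer.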
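To make the idea of suffix stripping concrete, here is a minimal sketch of a toy stemmer in plain Python. It is not the Porter or Lancaster algorithm: the function name `toy_stem` and the three-entry suffix list are assumptions made up for illustration, and real stemmers apply far more rules with extra conditions.

```python
# Hypothetical suffix list, for illustration only. Real stemmers such as
# Porter or Lancaster use many more rules and conditions.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = token.lower()
    for suffix in SUFFIXES:
        # Only strip when enough of the word would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for token in ["walking", "walked", "walks", "walk"]:
    print(token + ":" + toy_stem(token))
```

All four forms collapse to the same stem here, which is the goal of stemming. A rule set this naive fails quickly on other words, which is why the rule-based stemmers shown in this lesson are used in practice.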

Of the two, the Lancaster stemmer is more aggressive: it has about twice as many rules as the Porter stemmer and tends to over-stem words.

Porter Stemming

from nltk.stem import PorterStemmer

# Create the stemmer once and reuse it for every word.
pst = PorterStemmer()
stm = ["giving", "given", "given", "gave"]
for word in stm:
    # Print each word next to its Porter stem.
    print(word + ":" + pst.stem(word))

Lancaster Stemming

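from nltk.stem import LancasterStemmer

# Create the stemmer once and reuse it for every word.
lst = LancasterStemmer()
stm = ["giving", "given", "given", "gave"]
for word in stm:
    # Print each word next to its Lancaster stem.
    print(word + ":" + lst.stem(word))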
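
To compare how aggressive the two stemmers are, they can be run side by side on the same words. The sketch below assumes NLTK is installed. The variable names `porter`, `lancaster`, and `rows` are chosen here for illustration and are not part of the lesson's original code.

```python
from nltk.stem import LancasterStemmer, PorterStemmer

porter = PorterStemmer()
lancaster = LancasterStemmer()

words = ["giving", "given", "gave"]

# One (word, Porter stem, Lancaster stem) row per input word.
rows = [(w, porter.stem(w), lancaster.stem(w)) for w in words]

# Print the three columns left-aligned so the stems line up.
for word, p_stem, l_stem in rows:
    print(f"{word:<10}{p_stem:<10}{l_stem:<10}")
```

For these inputs the Lancaster stems should come out shorter than the Porter stems, which illustrates the Lancaster stemmer's tendency to over-stem.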

Key Points