Lemmatization
Overview
Time: min
Objectives
What is Lemmatization?
Lemmatization is the process of converting a word to its base form, called the lemma. In that respect it resembles stemming.
However, the main difference between stemming and lemmatization is that lemmatization considers the context and converts the word to its meaningful base form, whereas stemming just removes the last few characters, often leading to incorrect meanings and spelling errors.
For example, an aggressive stemmer may reduce the word “caring” to “car”, whereas lemmatization reduces it to “care”, which is a real word and keeps the original meaning.
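The contrast can be sketched without any library. The snippet below is a toy illustration only: `naive_stem`, `toy_lemmatize`, `SUFFIXES`, and `LEMMA_TABLE` are hypothetical names made up for this example, and neither function is the algorithm used by NLTK or any other package. It assumes a crude stemmer that strips a fixed suffix and a lemmatizer that looks words up in a small dictionary.

```python
# Toy illustration only: a hand-written suffix stripper and a tiny lookup
# table. These are NOT the algorithms used by NLTK, spaCy, or other libraries.

SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical suffix list

# Hypothetical mini-dictionary mapping inflected forms to their lemmas
LEMMA_TABLE = {
    "caring": "care",
    "corpora": "corpus",
}


def naive_stem(word: str) -> str:
    """Strip the first matching suffix without checking the result is a word."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word: str) -> str:
    """Look the word up in the dictionary; fall back to the word itself."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


for w in ("caring", "corpora"):
    print(w, "-> stem:", naive_stem(w), "| lemma:", toy_lemmatize(w))
```

The stemmer only manipulates characters, so it cannot tell that “car” is a different word from “care”, and it leaves the irregular plural “corpora” untouched. The lookup-based function returns a valid dictionary form in both cases, but only for words its table knows about.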
Lemmatization can be implemented in Python using the WordNet Lemmatizer (from NLTK), the spaCy lemmatizer, TextBlob, or Stanford CoreNLP.
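import nltk
from nltk.stem import LancasterStemmer, WordNetLemmatizer

# One-time download of the WordNet data used by the lemmatizer
nltk.download('wordnet')
nltk.download('omw-1.4')

lemmatizer = WordNetLemmatizer()
lst = LancasterStemmer()  # assumption: the original never defines `lst`

# lemmatize() treats words as nouns by default, so pass pos="v" for verbs
print("caring :", lemmatizer.lemmatize("caring", pos="v"))
print("caring :", lst.stem("caring"))
print("corpora :", lemmatizer.lemmatize("corpora"))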
Key Points