Machine Learning for Beginners in Python: How to Tag Parts Of Speech

Preliminaries


from nltk import pos_tag
from nltk import word_tokenize

Create Text Data


text_data = "Chris loved outdoor running"

Tag Parts Of Speech


# Split the sentence into word tokens, then tag each token with its part of speech
text_tagged = pos_tag(word_tokenize(text_data))


text_tagged
[('Chris', 'NNP'), ('loved', 'VBD'), ('outdoor', 'RP'), ('running', 'VBG')]
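
Because the result is an ordinary Python list of (word, tag) pairs, it can be processed with standard tools. The following is a minimal sketch, not part of the original walkthrough: it reuses the output shown above as a literal so it runs without NLTK installed, and the names `verbs` and `tag_counts` are illustrative.

```python
from collections import Counter

# Output copied from the example above, so no NLTK call is needed here.
text_tagged = [('Chris', 'NNP'), ('loved', 'VBD'), ('outdoor', 'RP'), ('running', 'VBG')]

# Keep only the words whose tag starts with "VB"
# (the Penn Treebank verb tags all share that prefix).
verbs = [word for word, tag in text_tagged if tag.startswith('VB')]

# Count how many times each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in text_tagged)

print(verbs)
print(tag_counts['NNP'])
```

Matching on the `VB` prefix rather than on one exact tag picks up both the past-tense verb and the gerund in this sentence.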

Common Penn Treebank Parts Of Speech Tags

The output is a list of tuples, each pairing a word with its part-of-speech tag. NLTK uses the Penn Treebank part-of-speech tags.

Tag    Part Of Speech
NNP    Proper noun, singular
NN     Noun, singular or mass
RB     Adverb
VBD    Verb, past tense
VBG    Verb, gerund or present participle
JJ     Adjective
PRP    Personal pronoun
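
The table can also be used as a lookup to make tagged output easier to read. This is a hedged sketch under stated assumptions: `TAG_DESCRIPTIONS` and `describe` are hypothetical names introduced here, the dictionary covers only the tags listed in the table, and the tagged sentence is copied from the output above rather than recomputed.

```python
# Descriptions copied from the table above (only a subset of the Penn Treebank tags).
TAG_DESCRIPTIONS = {
    'NNP': 'Proper noun, singular',
    'NN': 'Noun, singular or mass',
    'RB': 'Adverb',
    'VBD': 'Verb, past tense',
    'VBG': 'Verb, gerund or present participle',
    'JJ': 'Adjective',
    'PRP': 'Personal pronoun',
}

# Tagged sentence copied from the earlier output.
text_tagged = [('Chris', 'NNP'), ('loved', 'VBD'), ('outdoor', 'RP'), ('running', 'VBG')]


def describe(tagged, fallback='not in table'):
    """Attach a readable description to each (word, tag) pair."""
    return [(word, tag, TAG_DESCRIPTIONS.get(tag, fallback)) for word, tag in tagged]


described = describe(text_tagged)

for word, tag, description in described:
    print(f"{word:<10}{tag:<5}{description}")
```

The tag `RP` assigned to "outdoor" is the Penn Treebank tag for a particle. It is not in the table, so the sketch returns the fallback string for it instead of raising a `KeyError`.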