What is NLP? NLP Applications and Hierarchy

Mar 20, 2021

In this blog post, we will talk about Natural Language Processing (NLP), its applications, and the building-block objects used to represent text.

What is Natural Language Processing (NLP)?

Natural Language Processing is a subfield of artificial intelligence concerned with processing human language so that computers can work with it. For a computer to process human language, the text must first be represented numerically. Many of the applications we use every day are built with Natural Language Processing.
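To make the idea of a numeric representation concrete, here is a minimal sketch of a bag-of-words encoding in plain Python. It is only one of many possible representations and is not taken from any particular library; the helper names `build_vocabulary` and `bag_of_words` and the sample sentences are made up for illustration.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct lowercase word to an integer index (sorted for determinism)."""
    words = sorted({word for text in texts for word in text.lower().split()})
    return {word: index for index, word in enumerate(words)}


def bag_of_words(text, vocabulary):
    """Return a count vector: position i holds how often vocabulary word i occurs."""
    counts = Counter(text.lower().split())
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical sample texts, split on whitespace only.
texts = ["the hotel was great", "the food was great great"]
vocabulary = build_vocabulary(texts)
vectors = [bag_of_words(text, vocabulary) for text in texts]

print(vocabulary)  # {'food': 0, 'great': 1, 'hotel': 2, 'the': 3, 'was': 4}
print(vectors)     # [[0, 1, 1, 1, 1], [1, 2, 0, 1, 1]]
```

Each text becomes a list of numbers of the same length, which is the kind of input a classifier or similarity measure can work with.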

Examples of NLP Applications

Text Classification and Categorization
Summarization
Named Entity Recognition (NER)
Part-of-Speech Tagging
Semantic Parsing and Question Answering
Paraphrase Detection
Language Generation
Machine Translation
Speech Recognition
Character Recognition
Spell Checking

Objects in NLP

When building a general-purpose library for NLP applications, one should decide which concepts of the domain to encapsulate as objects. These objects are simple building blocks, each composed of smaller ones, so together they form a hierarchy that is easier to work with. For our application domain, we can identify four objects arranged in a hierarchy:

Corpus
Document
Sentence
Token

**Natural Language Processing Hierarchy**

Let’s examine these objects through an example.

Corpus: A website with travel guide blog posts.
Document: A single blog post on the travel guide website.
Sentence: Each sentence in a blog post on the travel guide website.
Token: Each word, punctuation mark, or emoji in a sentence.
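The hierarchy above can be sketched with a few plain Python data classes. This is an illustrative model only, not SadedeGel's implementation: the class layout, the `build_document` helper, the naive regular expressions for sentence and token boundaries, and the sample posts are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    text: str


@dataclass
class Sentence:
    text: str
    tokens: List[Token] = field(default_factory=list)


@dataclass
class Document:
    text: str
    sentences: List[Sentence] = field(default_factory=list)


@dataclass
class Corpus:
    documents: List[Document] = field(default_factory=list)


def build_document(text: str) -> Document:
    """Split raw text into sentences, and each sentence into tokens."""
    # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentence_texts = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    sentences = []
    for sentence_text in sentence_texts:
        # Naive rule: a token is a run of word characters or one non-space symbol.
        tokens = [Token(t) for t in re.findall(r"\w+|[^\w\s]", sentence_text)]
        sentences.append(Sentence(sentence_text, tokens))
    return Document(text, sentences)


# Hypothetical travel blog posts standing in for a corpus.
posts = ["Rome is lovely. Visit in spring!", "Pack light."]
corpus = Corpus([build_document(post) for post in posts])

first_sentence = corpus.documents[0].sentences[0]
token_texts = [token.text for token in first_sentence.tokens]
print(token_texts)  # ['Rome', 'is', 'lovely', '.']
```

Walking from `corpus` down to `token_texts` follows the same path as the hierarchy: corpus, document, sentence, token.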

Objects in SadedeGel

SadedeGel uses this structure to represent a corpus and its elements. Corpora are provided as built-in datasets of the library. Some of these are publicly available, including the TS Corpus, which contains more than 300K news documents in raw and tokenized form along with their category labels.

Doc

To trigger the SadedeGel NLP pipeline, initialize a Doc instance with a document string. In the example below, we load one of the datasets in the SadedeGel library with load_raw_corpus and create a Doc object from it. Then we view the text.

Example:

Output:

Sentence

A Sentence object represents a sentence and holds a list of the Tokens in that sentence. In the example below, we load one of the datasets in the SadedeGel library with load_raw_corpus and create a Doc object from it. Then we access its Sentence objects with the built-in list function.

Example:

Output:

Token

A Token is a word, punctuation mark, emoji, or whitespace. In the SadedeGel library, the Tokens of a sentence can be obtained as follows.
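Independently of SadedeGel's own tokenizers, the four token kinds named above can be illustrated with a small regular-expression tokenizer in plain Python. This is a rough sketch under stated assumptions: the pattern, the group names, and the emoji ranges (a few common Unicode blocks, far from complete) are choices made for this example and do not reflect how SadedeGel tokenizes text.

```python
import re

# Each alternative is a named group so we can report which kind of token matched.
TOKEN_PATTERN = re.compile(
    r"(?P<word>\w+)"                                      # letters, digits, underscores
    r"|(?P<emoji>[\U0001F300-\U0001FAFF\u2600-\u27BF])"   # one emoji (partial coverage)
    r"|(?P<space>\s+)"                                    # a run of whitespace
    r"|(?P<punct>[^\w\s])"                                # any other single symbol
)


def tokenize(text):
    """Return (kind, text) pairs covering every character of the input."""
    return [(match.lastgroup, match.group()) for match in TOKEN_PATTERN.finditer(text)]


# \U0001F600 is the grinning-face emoji.
tokens = tokenize("Great trip \U0001F600!")
for kind, token_text in tokens:
    print(kind, repr(token_text))
```

Because every character is either a word character, whitespace, or something else, joining the token texts back together reproduces the original string.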

Example:

Output:

Thanks for reading this blog post. If you want to learn more about SadedeGel, you can follow our other blog posts and visit the SadedeGel GitHub page.


Written by SadedeGel
