Natural Language Processing

Natural Language Processing Vocab:

  • Natural language processing (NLP)
  • Natural language
  • Artificial intelligence
  • Linguistics
  • High level programming languages
  • Syntax
  • Phrase structure rules
  • Voice recognition
  • Semantic
  • Knowledge graph
  • Spam filter
  • Sentiment analysis
  • Chatbot
  • Predictive text

Introduction:

As humans living in the age of technology, we interact with computers every single day. However, humans and computers communicate in fundamentally different ways. Human languages are natural: they have relatively little fixed structure and plenty of room for ambiguity. Computer languages, by contrast, are rigidly structured, with strict formatting and limitations. As the need for humans to communicate with computers grows, bridging that communication gap is becoming increasingly important. This is where natural language processing (NLP) comes in. NLP is a field that combines artificial intelligence, computer science, and linguistics (the scientific study of language, including the analysis of language form, language meaning, and language in context) so that computers can analyze, understand, and derive meaning from human language in a useful way.

What NLP is and why it is important

Natural language processing (NLP) is a field of artificial intelligence (giving computers the ability to perform tasks that normally require human intelligence) that combines computer science and linguistics (the study of language). The purpose of NLP is to give computers the ability to understand and interpret human language. You may wonder, can’t computers already do this? Don’t they understand our high level programming languages (computer coding languages that are closer to human languages)?

  • Yes, but even with high level programming languages, we still have to write our instructions in a certain way, using only certain words arranged according to strict rules (called syntax) and in a certain format. Any programming language comes with a rigid structure and many limitations. It is much more challenging for a computer to understand a natural language (human language) such as English, French, or Spanish.

Why is NLP a challenge?

  • NLP is a challenging field. That may sound odd at first, because language comes so easily and naturally to us. However, language is full of ambiguity, which makes it very difficult to teach machines how to understand and interpret it. Sources of difficulty include words with many meanings, speakers with different accents, slurred words, and mispronunciation. Humans can usually still understand one another despite these problems, but computers struggle to. NLP aims to change this and give computers the ability to comprehend natural language.

Key ideas and concepts of NLP

Phrase structure rules

If you were given the challenge of teaching the rules of a language to someone who has no idea of them, it is safe to say that you would start with the parts of speech. Take English as an example: the most fundamental parts of speech are nouns, verbs, adverbs, adjectives, pronouns, conjunctions, interjections, prepositions, and articles. This is how those who worked in NLP used to approach teaching computers language. However, there is a big issue with this method: a computer cannot simply look up the part of speech of each word, because many words have multiple meanings. For example, the word “bark” is a noun that means the outer layer of a tree. However, “bark” is also a verb that describes a dog making a noise.

This potential confusion led to the creation of phrase structure rules, which are rules that capture the structure of a phrase in a language. Here is an example of a phrase structure rule in English:

  • A sentence could be made up of a noun phrase and a verb phrase:
    • Noun phrase – article + noun OR adjective + noun OR just noun
    • Verb phrase – just verb OR verb + noun phrase

You can create rules like this for a whole language! What makes phrase structure rules so helpful is that they show how a sentence is constructed, which helps a computer work out which part of speech each word is playing in that sentence. This makes many applications of NLP much easier to build.
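To make the rules above more concrete, here is a minimal sketch in Python that checks whether a short sentence fits them. The tiny word list (LEXICON) and the function names are made up for illustration, and each word is given only one part of speech, which sidesteps the “bark” problem rather than solving it.

```python
# Hypothetical toy lexicon: each word is mapped to a single part of speech.
LEXICON = {
    "the": "article", "a": "article",
    "dog": "noun", "cat": "noun", "tree": "noun",
    "big": "adjective", "small": "adjective",
    "chased": "verb", "sleeps": "verb", "climbed": "verb",
}


def match_noun_phrase(tags, start):
    """Return the index just past a noun phrase starting at `start`, or None.

    Noun phrase: article + noun OR adjective + noun OR just noun.
    """
    if tags[start:start + 2] in (["article", "noun"], ["adjective", "noun"]):
        return start + 2
    if tags[start:start + 1] == ["noun"]:
        return start + 1
    return None


def match_verb_phrase(tags, start):
    """Return the index just past a verb phrase starting at `start`, or None.

    Verb phrase: just verb OR verb + noun phrase.
    """
    if tags[start:start + 1] != ["verb"]:
        return None
    after_verb = start + 1
    after_noun_phrase = match_noun_phrase(tags, after_verb)
    return after_noun_phrase if after_noun_phrase is not None else after_verb


def is_sentence(text):
    """Check whether the text is a noun phrase followed by a verb phrase."""
    words = text.lower().split()
    if not words or any(word not in LEXICON for word in words):
        return False
    tags = [LEXICON[word] for word in words]
    after_noun_phrase = match_noun_phrase(tags, 0)
    if after_noun_phrase is None:
        return False
    # The verb phrase must use up every remaining word.
    return match_verb_phrase(tags, after_noun_phrase) == len(tags)


print(is_sentence("The dog chased a cat"))
print(is_sentence("Chased the dog"))
```

Note that “The big dog sleeps” is rejected by this sketch, because the example rules only allow an article or an adjective before the noun, not both. A fuller set of rules would add more patterns.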

  • Example – using voice recognition: when you ask a question such as “When was the first computer created?”, the computer recognizes the keyword “when”, realizes that it is a “when” question, and knows that it should respond with a date or time period.
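The keyword idea in this example can be sketched in a few lines of Python. The table of question words (ANSWER_TYPES) and the function name are assumptions made up for this illustration; a real assistant would do far more than look for a single word.

```python
# Hypothetical mapping from a question keyword to the kind of answer expected.
ANSWER_TYPES = {
    "when": "date or time period",
    "where": "place",
    "who": "person",
    "why": "reason",
}


def expected_answer_type(question):
    """Return the kind of answer suggested by the first question keyword found."""
    words = question.lower().strip("?!. ").split()
    for word in words:
        if word in ANSWER_TYPES:
            return ANSWER_TYPES[word]
    return "unknown"


print(expected_answer_type("When was the first computer created?"))
```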

Phrase structure rules and other methods of breaking down and representing language structure can also be used by computers to generate natural language text. This works well when the data is stored in a web of semantic (self describing) information, where the components are linked in meaningful relationships so that informative sentences can be conveniently crafted.

Knowledge graphs

Google’s knowledge graph

When you enter a search in Google, Google first looks at your search keywords. Then, using its knowledge graph, Google displays results based on things that are linked to your keywords. Google’s knowledge graph is Google’s way of storing information about things (such as people, places, and objects) and the connections between them; its purposes are to improve search relevancy and to present results that provide direct answers, or at least closely related ones, to the search. Every time you search for something on Google and view the results, you are seeing the knowledge graph and NLP in action!
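To get a feel for what “linked information” means, here is a minimal sketch of a tiny knowledge graph in Python. It is not how Google’s system actually works; the three facts, the TRIPLES list, and the function names are all assumptions chosen for illustration. Each fact is stored as a (subject, relationship, object) triple, and because the relationships are self describing, simple sentences can be built straight from them.

```python
# Hypothetical miniature knowledge graph: (subject, relationship, object) triples.
TRIPLES = [
    ("Eiffel Tower", "is located in", "Paris"),
    ("Paris", "is the capital of", "France"),
    ("France", "is located in", "Europe"),
]


def facts_about(entity):
    """Return every stored fact that mentions the entity as subject or object."""
    return [triple for triple in TRIPLES if entity in (triple[0], triple[2])]


def describe(entity):
    """Turn each fact linked to the entity into a simple sentence."""
    return [f"{subject} {relation} {obj}." for subject, relation, obj in facts_about(entity)]


for sentence in describe("Paris"):
    print(sentence)
```

Searching for “Paris” here surfaces both the Eiffel Tower and France, because both are linked to the keyword.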

Common uses and importance of NLP

Spam filter

If you have an email account, you’ve experienced the benefits of a spam filter. A spam filter is a feature of an email service that is designed to block spam (unwanted content) from a user’s inbox, keeping the inbox clean and well organized. Spam filters analyze a number of details of an email before deeming it spam or non-spam; however, the most important part that the filter looks at is the content of the email. An effective way for the filter to decide whether an email is spam is to compare words and phrases in the email with words and phrases that it has observed to be common among known spam emails. It essentially analyzes the natural language of the email to decide whether it is inbox-worthy or spam-worthy.
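The word comparison described above can be sketched roughly as follows. The example messages, the 0.5 threshold, and all of the names are assumptions made up for illustration; real spam filters look at many more signals and use more careful statistics.

```python
from collections import Counter

# Hypothetical examples of emails already known to be spam or not spam.
KNOWN_SPAM = [
    "win a free prize now",
    "free money click now",
    "claim your free prize",
]
KNOWN_GOOD = [
    "meeting notes for monday",
    "are you free for lunch on monday",
    "your notes from the meeting",
]


def word_counts(messages):
    """Count how many messages each word appears in."""
    counts = Counter()
    for message in messages:
        counts.update(set(message.lower().split()))
    return counts


SPAM_COUNTS = word_counts(KNOWN_SPAM)
GOOD_COUNTS = word_counts(KNOWN_GOOD)


def spam_score(email):
    """Return the fraction of words seen more often in spam than in good mail."""
    words = email.lower().split()
    if not words:
        return 0.0
    spammy = sum(1 for word in words if SPAM_COUNTS[word] > GOOD_COUNTS[word])
    return spammy / len(words)


def is_spam(email, threshold=0.5):
    """Flag the email when at least `threshold` of its words look spammy."""
    return spam_score(email) >= threshold


print(is_spam("Click now for a free prize"))
print(is_spam("Notes for the monday meeting"))
```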

Sentiment analysis

As we already know by now, human communication is not straightforward. There are a lot of nuances (subtleties) and ambiguities. However, we humans have a lifetime of experience reading and sensing these subtleties. For example, you can tell whether your friends are happy or sad when they talk to you based on their word choice. Where it gets interesting, though, is that with NLP, computers can be given the ability to analyze sentiment (or feelings) through word choice as well! In sentiment analysis, computers are first given a large number of words, each labeled “positive” or “negative” depending on the feeling it represents. Once computers have been shown enough examples, they can start to spot and analyze feelings in words by themselves in more general contexts.

Sentiment analysis is useful in many ways. For example, let’s say you are a content creator on social media and you want to create and share more of what your audience wants to see. Using NLP, you can analyze your audience’s thoughts on your content through their comments and reviews (“I loved this” or “Didn’t really enjoy this…”). You can then create more of what they want to see, leaving you with a more satisfied audience, better reviews, and hopefully more happiness and profits, and leaving your audience with a better experience.
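Here is a minimal sketch of the labeled-word idea in Python. It uses a small fixed word list instead of learning from examples, and it adds a very simple rule for negation (“didn’t … enjoy”) that the description above does not mention. The word lists and names are assumptions made up for illustration.

```python
import re

# Hypothetical hand-labeled words; a real system would learn from many examples.
LABELED_WORDS = {
    "loved": "positive", "love": "positive", "enjoy": "positive", "great": "positive",
    "hated": "negative", "boring": "negative", "bad": "negative",
}
# Words that flip the feeling of the next labeled word (an added assumption).
NEGATORS = {"not", "never", "didn't", "don't", "wasn't"}


def sentiment(text):
    """Label the text positive, negative, or neutral from its word choice."""
    normalized = text.lower().replace("\u2019", "'")  # treat curly apostrophes as plain ones
    tokens = re.findall(r"[a-z']+", normalized)
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        label = LABELED_WORDS.get(token)
        if label is None:
            continue
        value = 1 if label == "positive" else -1
        score += -value if negate else value
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("I loved this"))
print(sentiment("Didn't really enjoy this..."))
```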

Chatbots

Have you ever gone onto a company’s website and been instantly greeted by someone through a chat box? I used to be astonished at how promptly they greeted me (just seconds after I’d arrived on the website)! The chat box would ask me if I needed any assistance, and if I entered a sentence, it would give me the exact help that I needed. I knew that these websites had a huge number of visitors, so I wondered how the employees could respond so quickly and know so much that they could answer nearly any question I had about their services or products. I soon came to realize that it wasn’t a person who was chatting with me; it was a chatbot.

A chatbot is a computer program or artificial intelligence that conducts a conversation through audio or text. It is also a prime example of NLP in action. Think about it: when a chatbot asks you, “How may I assist you?” and you respond, it first picks apart your words, finds keywords, analyzes them, and then returns a result associated with those keywords in order to have the highest chance of helping you. This is a key use of NLP in marketing: it increases convenience for customers (and employees) and increases satisfaction.
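The “pick apart the words, find keywords, return a matching result” loop can be sketched like this. The topics, the canned answers, and the function name are all hypothetical, and real chatbots are usually much more sophisticated than a keyword lookup.

```python
import re

# Hypothetical help topics for an imaginary online store: keywords and a canned answer.
TOPICS = [
    ({"refund", "return", "money"},
     "To start a return, open your order history and choose the item."),
    ({"shipping", "delivery", "arrive"},
     "Standard delivery usually takes a few business days."),
    ({"password", "login", "account"},
     "You can reset your password from the sign-in page."),
]
FALLBACK = "I'm not sure I understood. Could you rephrase that?"


def reply(message):
    """Answer with the topic whose keywords overlap most with the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    best_answer, best_hits = FALLBACK, 0
    for keywords, answer in TOPICS:
        hits = len(words & keywords)
        if hits > best_hits:
            best_answer, best_hits = answer, hits
    return best_answer


print(reply("When will my delivery arrive?"))
```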

Voice recognition

Voice recognition is becoming increasingly popular in the world of NLP. In fact, if you have ever used Siri, you have experienced the power of voice recognition firsthand! Here’s a fun fact about Siri: the name is often said to stand for Speech Interpretation and Recognition Interface, which fits nicely with what it does. Siri analyzes and understands your voice by following these general steps:

First, Siri converts what you say into a data file (taking into consideration your accent and nuances). Next, Siri sends your voice data file to an Apple server to be processed. The server has responses ready for most of the questions you are likely to ask. However, if you throw it a curveball that it is not able to understand, it responds with the standard “Would you like to search the web for that?” Siri also tries to understand the tone of your question or command, which makes it feel fairly intuitive. For example, try saying “I am sad” and “I am happy” and see what responses Siri comes up with for both.

Faster and more accurate typing

Predictive text

Predictive text is used in text messaging; it suggests words, based on the letters or phrases being entered, that fit the overall context of the message. This is useful because it saves users time and keystrokes, and it is a popular example of NLP at work. In order to accurately suggest words or phrases to complete a user’s message, predictive text looks at previous texts that the user has sent with the same first few letters, as well as general data about how users complete phrases that begin with those letters. The predictive text technology then chooses the completion that it thinks would be most appropriate in the context of the user’s message.

A really popular place where this technology is used is Google search! Think about it: when was the last time you typed a full Google search without the search engine suggesting how to complete it? Try it out right now!
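As a rough sketch of this idea, the Python below ranks words that start with the typed letters, counting words from the user’s own past messages more heavily than words from general data. The sample messages, the weighting of 2, and the names are assumptions made up for illustration, and the sketch ignores the surrounding context of the message.

```python
from collections import Counter

# Hypothetical data: the user's past messages and messages from users in general.
USER_HISTORY = ["see you tomorrow", "see you soon", "thanks so much", "see you tomorrow"]
GENERAL_DATA = ["thank you", "that sounds good", "the meeting is today", "thanks again"]


def build_counts(messages):
    """Count how often each word appears across the messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.lower().split())
    return counts


USER_COUNTS = build_counts(USER_HISTORY)
GENERAL_COUNTS = build_counts(GENERAL_DATA)


def suggest(prefix, limit=3, user_weight=2):
    """Suggest up to `limit` words starting with `prefix`, most likely first."""
    prefix = prefix.lower()
    scores = Counter()
    for word, count in USER_COUNTS.items():
        if word.startswith(prefix):
            scores[word] += user_weight * count  # the user's own habits count extra
    for word, count in GENERAL_COUNTS.items():
        if word.startswith(prefix):
            scores[word] += count
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


print(suggest("to"))
print(suggest("th"))
```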

Accurate typing

NLP also helps users type more accurately with spell check. Our “natural language” typing may sometimes be full of errors, but spell check (which is NLP based) is here to save the day! When a user types a word incorrectly, the spell checker first detects that the word is incorrect, then compares it with similar looking or similar sounding words from its large dictionary, and finally suggests a correction.
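The detect, compare, and suggest steps can be sketched with Python’s standard difflib module, which measures how similar two strings look. The six-word dictionary, the 0.7 similarity cutoff, and the function name are assumptions for illustration, and this sketch only handles similar looking words, not similar sounding ones.

```python
from difflib import get_close_matches

# Hypothetical miniature dictionary; a real spell checker would use a far larger one.
DICTIONARY = ["language", "natural", "computer", "processing", "sentence", "grammar"]


def check_word(word, max_suggestions=3):
    """Return suggested corrections, or an empty list if the word is spelled correctly."""
    word = word.lower()
    if word in DICTIONARY:
        return []  # step 1: the word is in the dictionary, so nothing to fix
    # steps 2 and 3: compare with dictionary words and suggest the closest ones
    return get_close_matches(word, DICTIONARY, n=max_suggestions, cutoff=0.7)


print(check_word("langauge"))
print(check_word("computer"))
```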

Wrap up:

Outline of the topics touched upon in this article:

  • What NLP is and why it is important
    • Why NLP is a challenging field
  • Key ideas and concepts of NLP
    • Phrase structure rules
    • Knowledge graphs
  • Common uses and importance of NLP
    • Spam filter
    • Sentiment analysis
    • Chatbots
    • Voice recognition
    • Faster and more accurate typing
      • Predictive text
      • Accurate typing (spell check)

Fascinated with the topic of natural language processing? (I am too). Read more below if you are interested!

Natural Language Processing Vocab Defined:

  • Natural language processing (NLP) – a field of artificial intelligence that combines computer science and linguistics whose purpose is to give computers the ability to understand and interpret human language
  • Natural language – human language; contains less structure and more ambiguity
  • Artificial intelligence – giving computers intelligence to perform tasks that normally require human intelligence
  • Linguistics – the scientific study of language, including the analysis of language form, language meaning, and language in context
  • High level programming languages – computer coding languages that are closer to human languages
  • Syntax – the arrangement of words and phrases to create well-formed sentences in a language
  • Phrase structure rules – rules that capture the structure of a phrase in a language
  • Voice recognition – the ability of a computer or other machine to understand spoken instructions or to recognize human voices
  • Semantic – having to do with the meaning of a word
  • Knowledge graph – Google’s way of storing information about things and the connections between them
  • Spam filter – an email service feature designed to block spam (unwanted content) from a user’s inbox
  • Sentiment analysis – the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer’s attitude towards a particular topic, product, etc., is positive, negative, or neutral
  • Chatbot – a computer program designed to simulate conversation with human users
  • Predictive text – technology that facilitates typing on a mobile device by suggesting words the end user may wish to insert in a text field; predictions are based on the context of other words in the message and the first letters typed

Quiz: