Unlocking the Power of Natural Language Processing (NLP): Real World Applications and Controversies

Natural Language Processing (NLP) is a powerful technology that influences our everyday lives. It breaks down language barriers through translation, eases text messaging with autocorrect and predictive text, and answers our questions instantly in website chats. However, its capabilities can be controversial, and not every application has an entirely positive impact.

What is Natural Language Processing?

Natural Language Processing gives computers the ability to comprehend, interpret and mimic text-based human language. It consists of two subcategories:

  • Natural Language Understanding (NLU): This enables computers to comprehend text-based human language. Typical tasks include sentiment analysis and Named Entity Recognition (NER).
  • Natural Language Generation (NLG): This enables computers to produce human-like sentences. It can be useful for generating business summary reports and creative content.

In some cases, NLU and NLG are combined, for example in chatbots such as ChatGPT or the chat widget on a company’s website. To simulate a normal human conversation, the chatbot needs to understand the user’s query and then generate a response accordingly.
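
As a rough illustration of how the two halves fit together, the sketch below pairs a keyword-based "understanding" step with a template-based "generation" step. It is far simpler than a real chatbot, and the intents, keywords, templates and facts are all invented for this example:

```python
import re

# Hypothetical intents and keywords, invented for this illustration
INTENT_KEYWORDS = {
    "opening_hours": {"open", "hours", "close"},
    "pricing": {"price", "cost", "pricing"},
}

# Hypothetical response templates, one per intent
RESPONSES = {
    "opening_hours": "We are open {hours}.",
    "pricing": "Our plans start at {price} per month.",
    "unknown": "Sorry, I did not understand. Could you rephrase that?",
}

def understand(query):
    """NLU step: map free text to an intent label by keyword overlap."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return "unknown"

def generate(intent, facts):
    """NLG step: fill in the template for the detected intent."""
    return RESPONSES[intent].format(**facts)

def chatbot(query, facts):
    """Understand the query first, then generate a reply."""
    return generate(understand(query), facts)

# Made-up business facts used to fill the templates
FACTS = {"hours": "9am to 5pm, Monday to Friday", "price": "10 GBP"}
print(chatbot("What time do you open?", FACTS))
```

Modern chatbots replace both steps with learned models, but the division of labour is the same: interpret the query, then produce a reply.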

The Controversy Surrounding NLG

While NLU has its own controversial aspects, most of the controversy surrounds NLG. A significant recent example is the 2023 Writers Guild of America strike, which lasted almost five months. One of the main reasons the screenwriters went on strike was the use of Artificial Intelligence in the creation of scripts. As NLG models became more accessible, writers feared being partially or fully replaced by them. Copyright concerns were also raised, because the large datasets used to train such models included the writers’ work without compensation.

There are many other ethical concerns when it comes to NLG, including the generation of fake news reports and phishing emails, as well as its role in deepfakes. Therefore, it’s important to be mindful of ethics and new laws when training and using NLP methods.

Natural Language Generation (NLG) Poetry Example

The first step in constructing an NLG model involves sourcing suitable publicly available data. It’s crucial to verify that the data’s licensing aligns with your intended use. In this case, we are using the following poetry data, which has a general public license: https://www.kaggle.com/datasets/tgdivy/poetry-foundation-poems/

After consolidating this data into a single string, we must decide whether to use characters or words as the model’s basic elements. Both of these have their advantages depending on the scenario:

  • Using characters gives more granular control over the output and allows more freedom in creative writing.
  • Using words usually means shorter sequences, which makes training and generation faster, and is generally considered easier to implement. It also tends to result in more human-like text.
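
To make the trade-off concrete, the short sketch below tokenizes the same two-line text both ways. The sample text is made up, and a plain whitespace split stands in for a real word tokenizer:

```python
text = "the rose is red\nthe violet is blue"

# Character-level: every character, including spaces and line breaks, is a token
char_tokens = list(text)
char_vocab = sorted(set(char_tokens))

# Word-level: a plain whitespace split stands in for a real tokenizer here
word_tokens = text.split()
word_vocab = sorted(set(word_tokens))

print("characters:", len(char_tokens), "tokens,", len(char_vocab), "distinct")
print("words:     ", len(word_tokens), "tokens,", len(word_vocab), "distinct")
```

The character version produces a much longer sequence for the same text. On a real corpus the word vocabulary grows far larger than the character vocabulary, which is the other side of the trade-off.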

In this case, we opted for words for quicker demonstration purposes. The poems were broken down into sequences of 30 words, and each word was encoded as an integer index, with the word that follows each sequence one-hot encoded as the training target. The code for this is shown below:

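import nltk  # word_tokenize needs NLTK's Punkt tokenizer data to be downloaded
import numpy as np
from tensorflow.keras import utils

# poems_string is the single string of consolidated poems from the previous step
Text_Data = poems_string
# Pad the marker with spaces so each line break becomes its own 'newline' token
Text_Data = Text_Data.replace("\n", " newline ")
tokens = nltk.word_tokenize(Text_Data)
# Sorted vocabulary, saved so predicted indices can be decoded back into words
wordindex = sorted(set(tokens))
np.save("wordindex.npy", wordindex)
word_size = len(wordindex)
# A dictionary lookup avoids a slow list.index() search for every token
word_to_id = {word: i for i, word in enumerate(wordindex)}
seq_len = 30
x_train = []
y_train = []
# Slide a window over the tokens: 30 input words, then the word that follows
for i in range(len(tokens) - seq_len):
    X = tokens[i:i + seq_len]
    Y = tokens[i + seq_len]
    x_train.append([word_to_id[x] for x in X])
    y_train.append(word_to_id[Y])
x_train = np.reshape(x_train, (len(x_train), seq_len))
# One-hot encode the targets to match the softmax output layer
y_train = utils.to_categorical(y_train, num_classes=word_size)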
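
The windowing step is easier to follow on a toy input. The sketch below applies the same sliding-window idea with a window of 3 instead of 30; the helper name `make_windows` and the sample tokens are made up for this illustration:

```python
def make_windows(tokens, seq_len):
    """Return (inputs, targets): each input is seq_len tokens, its target is the next token."""
    inputs, targets = [], []
    for i in range(len(tokens) - seq_len):
        inputs.append(tokens[i:i + seq_len])
        targets.append(tokens[i + seq_len])
    return inputs, targets

toy_tokens = ["roses", "are", "red", "newline", "violets", "are", "blue"]
inputs, targets = make_windows(toy_tokens, seq_len=3)
for window, target in zip(inputs, targets):
    print(window, "->", target)
```

Seven tokens with a window of 3 give four training pairs, each shifted one word along from the last.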

Following this, the next step is to write and train the model. An LSTM (Long Short-Term Memory) network is a well-suited choice for NLG because it can retain information over longer sequences than a plain recurrent network:

from tensorflow.keras import layers, models, optimizers

def LSTM_model():
    # Input: a sequence of seq_len word indices
    inp = layers.Input(shape=(seq_len, ))
    # Embedding weights are random and frozen here; set trainable=True to learn them
    x = layers.Embedding(word_size, seq_len, trainable=False)(inp)
    # In TensorFlow 2, layers.LSTM uses the cuDNN kernel automatically on a GPU
    x = layers.LSTM(512, return_sequences=True)(x)
    x = layers.LSTM(512)(x)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(512, activation="elu")(x)
    x = layers.Dense(256, activation="elu")(x)
    x = layers.Dropout(0.3)(x)
    # One probability per vocabulary word for the next position
    outp = layers.Dense(word_size, activation='softmax')(x)

    model = models.Model(inputs=inp, outputs=outp)
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizers.Adam(learning_rate=0.001),
                  metrics=['accuracy']
                 )
    return model
model = LSTM_model()
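
The article does not show how text is generated from the trained model, so the following is only a sketch of one common approach: predict a probability for every word, sample one, append it, and repeat. The names `sample_with_temperature`, `generate_words` and `toy_predict` are hypothetical, as are the temperature parameter and the choice to left-pad short seeds with index 0 (which is a real word in the vocabulary built above, so a dedicated padding token may work better). A stand-in predictor replaces the trained network so the example runs on its own:

```python
import math
import random

def sample_with_temperature(probs, temperature=1.0, rng=random):
    """Pick an index from a probability list, sharpened or flattened by temperature."""
    # Rescale log-probabilities; a low temperature favours the most likely words
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)
    weights = [math.exp(logit - peak) for logit in logits]
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]

def generate_words(predict_proba, seed_ids, n_words, seq_len, temperature=1.0, rng=random):
    """Repeatedly predict the next word index and feed it back in as input."""
    ids = list(seed_ids)
    for _ in range(n_words):
        # Assumption: left-pad with index 0 while the text is shorter than seq_len
        window = ids[-seq_len:]
        window = [0] * (seq_len - len(window)) + window
        probs = predict_proba(window)
        ids.append(sample_with_temperature(probs, temperature, rng))
    return ids

def toy_predict(window):
    """Stand-in for the trained network: strongly prefers (last index + 1) mod 5."""
    nxt = (window[-1] + 1) % 5
    return [0.96 if i == nxt else 0.01 for i in range(5)]

generated = generate_words(toy_predict, seed_ids=[0], n_words=6, seq_len=3,
                           temperature=0.1, rng=random.Random(0))
print(generated)
```

With the real model, `predict_proba` could plausibly wrap `model.predict` on a batch containing the single window, and the resulting indices could be turned back into words through `wordindex`, converting each 'newline' token into a line break.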

The Results

Using the word ‘the’ as a simple seed, the model generated the following poem:

NLP Services Offered at Vsio Applied Analytics

Despite the controversies, Natural Language Processing opens doors to a world of possibilities. At Vsio Applied Analytics, we’re passionate about harnessing the positive potential of NLP to address your unique challenges and drive growth, and we have a wide range of bespoke services that you can benefit from:

  • Social Media Sentiment Analysis: As your business grows, it can be difficult to keep track of your reputation on social media. To help with this, we can analyse the comments, reviews and mentions on your social media profiles and determine sentiment scores for your business.
  • Chatbot Solutions: Enhance customer engagement with our AI-powered chatbots. These can provide instant responses, automate tasks, and improve user experiences, making your business available 24/7.
  • Language Translation: We can automatically translate content between languages, helping you to communicate with diverse audiences.
  • Named Entity Recognition: When you have vast amounts of text-based data to analyse and you’re not sure where to start, NER can be a great solution. Our team can extract meaningful entities such as people and business names to help you kickstart your analysis.

The possibilities with NLP are boundless, and at Vsio Applied Analytics, we’re dedicated to finding bespoke solutions for your unique challenges. Contact us today for innovative NLP solutions tailored to your needs.

Erin Ward
