Polish sentiment analysis


Sentiment analysis is a natural language processing (NLP) task in which a text is analysed and the attitude it expresses is predicted.

In this post, I will show you how to classify the sentiment of Polish-language texts as positive, neutral, or negative in Python using the Keras deep learning library.

Word2vec is a group of related models that are used to produce word embeddings. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space [1].

Word embedding is the collective name for a set of language modeling and feature learning techniques in NLP where words or phrases from the vocabulary are mapped to vectors of real numbers. Word and phrase embeddings, when used as the underlying input representation, have been shown to boost performance in NLP tasks such as syntactic parsing and sentiment analysis.
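
To make "close proximity in the vector space" concrete, here is a minimal sketch that compares word vectors with cosine similarity. The 4-dimensional vectors are made up for illustration only; real word2vec vectors are learned from a corpus and typically have several hundred dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (NOT taken from a trained model).
toy_vectors = {
    "dobry": [0.9, 0.1, 0.3, 0.0],     # "good"
    "świetny": [0.8, 0.2, 0.4, 0.1],   # "great"
    "okropny": [-0.7, 0.1, 0.2, 0.9],  # "awful"
}

sim_close = cosine_similarity(toy_vectors["dobry"], toy_vectors["świetny"])
sim_far = cosine_similarity(toy_vectors["dobry"], toy_vectors["okropny"])

print(f"dobry vs świetny: {sim_close:.3f}")
print(f"dobry vs okropny: {sim_far:.3f}")
```

With these toy values, the two positive words score much higher against each other than "dobry" does against "okropny", which is the kind of structure a trained embedding is expected to show.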

Why is sentiment analysis for the Polish language difficult?

  • syntax – relationships between words in a sentence can often be expressed in several ways, which leads to different interpretations of the text,
  • semantics – the same word can have many meanings, depending on the context,
  • pragmatics – the occurrence of metaphors, tautologies, irony, etc.,
  • diacritical marks such as: ą, ć, ę, ł, ń, ó, ś, ź, ż,
  • homonyms – words with the same linguistic form, but derived from words of different meaning,
  • synonyms – different words with the same or very similar meaning,
  • idioms – expressions whose meaning differs from the one that their constituent parts and the rules of syntax would suggest,
  • more than 150k words in the basic dictionary.
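
One practical consequence of the diacritics point above: the same Polish letter can be encoded in Unicode either as a single code point or as a base letter plus a combining mark, so two visually identical words may not compare equal. The sketch below assumes that normalising to NFC before tokenisation is acceptable for your data; it is one possible preprocessing step, not necessarily the one used for the model in this post.

```python
import unicodedata


def normalize_polish(text):
    """Normalise text to NFC so each accented letter is a single code point."""
    return unicodedata.normalize("NFC", text)


# "żółty" ("yellow") typed with combining marks: z + dot above, o + acute accent.
decomposed = "z\u0307o\u0301\u0142ty"
composed = "żółty"

print(decomposed == composed)                    # the raw strings differ
print(normalize_polish(decomposed) == composed)  # equal after normalisation
```

Note that "ł" has no decomposed form in Unicode, so it passes through normalisation unchanged.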

Model architecture

The model consists of one embedding layer, one recurrent (LSTM) layer, and a Dense layer with a single Dropout.
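
The architecture described above could be sketched in Keras roughly as follows. The vocabulary size, embedding dimension, sequence length, LSTM width, dropout rate, and the position of the Dropout layer are all assumptions chosen for illustration; the post does not state the actual values. The three output units correspond to the positive, neutral, and negative classes.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Dense, Dropout, Embedding
from tensorflow.keras.models import Sequential

# All hyperparameters below are illustrative assumptions.
VOCAB_SIZE = 20000    # number of distinct token ids
EMBEDDING_DIM = 100   # size of each word vector
MAX_LEN = 50          # tokens per (padded) input text
NUM_CLASSES = 3       # positive, neutral, negative

model = Sequential([
    Input(shape=(MAX_LEN,), dtype="int32"),
    Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM),
    LSTM(64),
    Dropout(0.5),  # placement before the Dense layer is an assumption
    Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# Untrained forward pass on a dummy batch, just to check the shapes.
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
probabilities = model.predict(dummy_batch, verbose=0)
print(probabilities.shape)
```

The Embedding layer could also be initialised from pre-trained word2vec vectors instead of being learned from scratch.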


To communicate with mlAPI, you need a user account. The best way to get one is to contact us at office@ermlab.com.

Once you have your credentials, you can log in as follows:

On successful authentication, the response from the server should look like this:

Authorization and sample request

Every request you send to the server must include your TOKEN.

Because we use JWT for authentication and authorization, the TOKEN should be attached as follows:
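
A common convention for JWT is to send the token in an `Authorization: Bearer` header. The sketch below assumes that scheme; the URL and the `text` payload field are hypothetical placeholders, not the documented mlAPI interface, so check the values you received with your credentials. The request is only built here, not sent.

```python
import json
import urllib.request


def build_authorized_request(url, token, payload):
    """Build a JSON POST request carrying a JWT as a Bearer token (assumed scheme)."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


request = build_authorized_request(
    "https://mlapi.example.invalid/sentiment",  # hypothetical endpoint
    "TOKEN",                                    # replace with your real token
    {"text": "To był świetny film."},           # hypothetical payload field
)
print(request.get_header("Authorization"))
```

To actually send it, pass the request to `urllib.request.urlopen(request)` once the real endpoint is known.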


Valid response