Definition

A stop word can be a word that has meaning in a specific language, or it can be a token that has no linguistic meaning. According to Wikipedia, the definition of stop words is:

In computing, stop words are words which are filtered out prior to, or after, processing of natural language data (text). It is controlled by human input and not automated. There is not one definite list of stop words which all tools use, if even used.

Why remove?

These words are removed from search queries for reasons that range from saving space to speeding up searches. Although they are very common and heavily repeated in conversation, in computing we can still work with sentences that leave them out.

An example

Për një botë më të lirë dhe më të mirë. (In English: For a freer and better world.)

Key words:

Për një botë më të lirë dhe më të mirë.

If we analyze the sentence, all of its meaning rests on three key words: botë, lirë and mirë. The other words only make the sentence grammatically correct and aren’t crucial for search. If we take just the selected parts, botë lirë mirë / bote lire mire, we can see that any combination of them could plausibly show up in searches.
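To make the example concrete, here is a minimal Python sketch of the filtering step. The function name remove_stop_words and the tiny STOP_WORDS set are assumptions made for illustration (the set is only a handful of entries taken from the list in this post), and the tokenizer is deliberately simple:

```python
import re

# A small subset of the Albanian stop-word list, chosen for illustration only.
STOP_WORDS = {"për", "një", "më", "të", "dhe"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stop_words]

print(remove_stop_words("Për një botë më të lirë dhe më të mirë."))
# prints ['botë', 'lirë', 'mirë']
```

A real search engine would likely use a more careful tokenizer, but the idea is the same: only the words that carry meaning are kept.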

Albanian stop words

There are several lists of stop words for several languages, but so far I haven’t seen any list of Albanian stop words. So I decided to publish an initial version of my list. It is expected to change in the future, so I really encourage you to add your comments here.

*Because Albanian has the special letters Ç and Ë, I’ll write both versions of the words that contain these letters.

  • a
  • apo
  • asnjë / asnje
  • ata
  • ato
  • ca
  • deri
  • dhe
  • do
  • e
  • i
  • jam
  • janë / jane
  • jemi
  • jeni
  • ju
  • juaj
  • kam
  • kaq
  • ke
  • kemi
  • këtë / kete
  • më / me
  • mu
  • në / ne
  • nëse / nese
  • një / nje
  • nuk
  • pa
  • pas
  • pasi
  • për / per
  • prej
  • që / qe
  • sa
  • së / se
  • seç / sec
  • si
  • saj
  • të / te
  • ti
  • tek
  • tij
  • tonë / tone
  • tuaj
  • ty
  • tyre
  • unë / une
  • veç / vec
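Since the list gives both spellings for words containing Ç or Ë, the plain-letter variants could also be generated rather than typed by hand. The following Python sketch shows one possible way, using Unicode decomposition from the standard library. The names fold_diacritics and expand_variants are hypothetical helpers, not part of any existing tool:

```python
import unicodedata

def fold_diacritics(word):
    """Return the word with combining marks removed (ë -> e, ç -> c)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def expand_variants(words):
    """Return a set holding each word plus its diacritic-free spelling."""
    expanded = set()
    for word in words:
        word = unicodedata.normalize("NFC", word.lower())
        expanded.add(word)
        expanded.add(fold_diacritics(word))
    return expanded

print(sorted(expand_variants(["këtë", "seç", "nuk"])))
```

With this approach only the correctly spelled form needs to be maintained in the list. Keep in mind that folding can make two different words look the same, so it is best applied to the stop-word list and the query in the same way.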

I can’t publish this post without giving credit to my valued friend Valon Canhasi, who raised the idea of having this list in our language. Thanks a lot, Valon!