"For these algorithms to work, we need to select some feature..." — Jasneet Sabharwal

“For these algorithms to work, we need to select some features from text. Some of the commonly used features for NER are unigram, left and right bigrams and trigrams, part of speech tags, whether the word is capitalized or not, is the first character capitalized or not, whether the word is surrounded by quotes or not, is there a hyphen in the word, is the word present in our gazetteer list (it's a list which would contain names of people, organizations, locations, etc. mined from various sources like Wikipedia and Freebase), word suffixes and prefixes.”

— Jasneet Sabharwal, How can I find city, country, company name from a tweet text using Java?
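
The features listed in the quote can be sketched as a per-token feature dictionary, the usual input shape for sequence taggers such as CRFs. This is only an illustration, not the author's code: the function name `token_features`, the `<PAD>` boundary marker, the 3-character affix length, the feature key names, and the tiny sample gazetteer are all assumptions made for the example. Part-of-speech tags are passed in rather than computed, so any tagger could supply them.

```python
from typing import Dict, List, Set, Union

Feature = Union[str, bool]

# Quote tokens a tokenizer might emit (assumed set, extend as needed).
QUOTES = {'"', "'", "“", "”", "``", "''"}


def token_features(tokens: List[str], pos_tags: List[str], i: int,
                   gazetteer: Set[str], affix_len: int = 3) -> Dict[str, Feature]:
    """Build the feature dictionary for the token at position i."""

    def tok(j: int) -> str:
        # Lowercased neighbour, or a padding marker past the sentence edge.
        return tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"

    word = tokens[i]
    return {
        "unigram": tok(i),
        "left_bigram": f"{tok(i - 1)} {tok(i)}",
        "right_bigram": f"{tok(i)} {tok(i + 1)}",
        "left_trigram": f"{tok(i - 2)} {tok(i - 1)} {tok(i)}",
        "right_trigram": f"{tok(i)} {tok(i + 1)} {tok(i + 2)}",
        "pos": pos_tags[i],
        "all_caps": word.isupper(),          # whole word capitalized
        "init_cap": word[:1].isupper(),      # first character capitalized
        "in_quotes": tok(i - 1) in QUOTES and tok(i + 1) in QUOTES,
        "has_hyphen": "-" in word,
        "in_gazetteer": word.lower() in gazetteer,
        "prefix": word[:affix_len].lower(),
        "suffix": word[-affix_len:].lower(),
    }


# Hypothetical sample sentence with hand-written POS tags.
tokens = ["Google", "opened", "an", "office", "in", "New-York", "."]
pos_tags = ["NNP", "VBD", "DT", "NN", "IN", "NNP", "."]
gazetteer = {"google"}  # a real list would be mined from Wikipedia, Freebase, etc.

feats = token_features(tokens, pos_tags, 0, gazetteer)
```

Calling the function once per token yields one dictionary per position; a sequence model would then be trained on those dictionaries paired with entity labels.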

NER NLP