Hi Deepak,
My two cents:
1. Natural language processing per document (in this case a single tweet) works on the premise of one language, so for example it will be hard to process the same tweet as both German and English.
2. However, we can handle an overlapping/hybrid language like Hinglish. This just means that some words are foreign to the grammar of the main language, for example "Haan, this is right".
3. Building on this, there are two approaches:
Approach 1: Synonyms. Use HANA dictionary objects to add these new words and map them to a neutral word in the native language, for example "Haan" to "Yes". This in principle already resolves quite a lot of the text analysis.
Approach 2: Rules. This in fact uses the output of approach 1 in so-called rule files. The compiled version is a .fsm file and the uncompiled version is a .rul file. In the .rul file you write rules for whatever analysis you want.
Example:
Bohot Bura
With approach 1, this normalizes to
Very Bad
and the rule would say: if the neutral word "VERY" (or one of its synonyms) comes directly before the neutral word "BAD" (or one of its synonyms), then the sentiment is strong negative.
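To make the idea concrete, here is a rough sketch in plain Python. It is not HANA dictionary or .rul syntax, just an illustration of the two steps. The SYNONYMS table, the function names and the "StrongNegative" label are all made up for this example.

```python
# Hypothetical synonym table standing in for a custom dictionary (approach 1).
# Keys are the Hinglish variants, values are the neutral English words.
SYNONYMS = {
    "haan": "yes",
    "bohot": "very",
    "bura": "bad",
}


def normalize(text):
    """Lowercase, split on whitespace, and map known variants to neutral words."""
    return [SYNONYMS.get(token, token) for token in text.lower().split()]


def sentiment(text):
    """Toy rule (approach 2): "very" directly before "bad" is strong negative."""
    tokens = normalize(text)
    for previous, current in zip(tokens, tokens[1:]):
        if previous == "very" and current == "bad":
            return "StrongNegative"
    return None  # the rule did not match


print(normalize("Bohot Bura"))
print(sentiment("Bohot Bura"))
```

The point is that the rule only ever looks at the neutral words, so adding a new spelling or a new Hinglish variant only means adding one more dictionary entry, not changing the rule.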
Hope this helps.
Cheers,
Jemin