Been stuck on a unique NLP problem [D]

I am developing an app that needs to classify texts. The problem is that the texts can be in English, Hindi, or Hindi+English (Hindi written in the English alphabet). I naturally went with a sentence transformer, but it fails abysmally on Hindi+English: the model seems to capture almost no semantic meaning for this kind of text. I know an LLM would be a solution, but it would make my application too heavy. I also thought about transliteration, but that seems to be inaccurate and ends up corrupting the text.
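For reference, the setup described above (embed each text, then pick the closest class) can be sketched roughly as follows. This is only an illustration under assumptions: `toy_embed` is a hypothetical character-trigram stand-in for a real sentence-transformer encoder so that the snippet runs without downloading a model, and the labels and example texts are invented for the sketch, not taken from my data.

```python
import math
from collections import Counter, defaultdict


def toy_embed(text):
    """Hypothetical stand-in for a sentence encoder: L2-normalized character trigram counts."""
    padded = f" {text.lower()} "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(gram, 0.0) for gram, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit_centroids(examples, embed=toy_embed):
    """Sum the embeddings of each class; cosine ignores scale, so no averaging is needed."""
    sums = defaultdict(Counter)
    for text, label in examples:
        for gram, w in embed(text).items():
            sums[label][gram] += w
    return {label: dict(vec) for label, vec in sums.items()}


def classify(text, centroids, embed=toy_embed):
    """Return the label whose centroid is most similar to the embedded text."""
    query = embed(text)
    return max(centroids, key=lambda label: cosine(query, centroids[label]))


# Invented examples covering English, Hindi, and Hindi written in the English alphabet.
examples = [
    ("the delivery was very late", "complaint"),
    ("delivery bahut late thi", "complaint"),
    ("डिलीवरी बहुत देर से आई", "complaint"),
    ("great service, thank you", "praise"),
    ("service bahut acchi thi", "praise"),
    ("सेवा बहुत अच्छी थी", "praise"),
]

centroids = fit_centroids(examples)
print(classify("delivery was late", centroids))
```

In the real pipeline, `embed` would be replaced by the sentence-transformer encoder; the classification step stays the same, so whatever the encoder gets wrong on Hindi+English input carries straight through to the predicted label.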

Has anyone else faced a similar issue? What direction should I take?

submitted by /u/Sadgeincomp