Mujeeba Haj Najeeb asked:
First of all, I need to understand: why go for that at all?
I also need to know some real-life applications, to understand it more specifically.
I need to know where I am going; I need a general perspective to decide whether to enter such a field or not.
Thank you for your clarification; fair enough.
It's hard to give a comprehensive list of applications of this kind of analysis, but, generally, it is used in some fields of linguistics, in particular natural language processing:
http://en.wikipedia.org/wiki/Natural_language_processing.
It can be considered one of the many parts of computational linguistics:
http://en.wikipedia.org/wiki/Computational_linguistics.
See also:
http://en.wikipedia.org/wiki/Word-sense_induction,
http://en.wikipedia.org/wiki/Ambiguity.
Note that in the examples mentioned above, the norm (distance) itself (I discussed the role of the norm in Solution 1) is extremely complex: it should reflect semantic similarity, a very complex notion which is itself very hard to formalize. The norm values for some word set (a thesaurus) can come from extensive statistical analysis, expert systems, and the like. This has nothing to do with the string comparison algorithms implemented in most libraries.
See also:
http://en.wikipedia.org/wiki/Word-sense_disambiguation.
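To make the difference between a string-comparison distance and a statistics-based one more tangible, here is a minimal sketch in Python. Everything in it is my assumption for illustration only: the toy corpus, the function names, and the choice of cosine distance over co-occurrence counts as a crude stand-in for "semantic" distance. It is not a proper norm in the mathematical sense and nowhere near what real research uses; it only shows that the two kinds of measure can disagree.

```python
from collections import Counter
from math import sqrt


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: looks at spelling only, knows nothing of meaning."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def context_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            start = max(0, i - window)
            neighbours = words[start:i] + words[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(neighbours)
    return vectors


def cosine_distance(u: Counter, v: Counter) -> float:
    """1 - cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical four-sentence "corpus"; a real one would be vastly larger.
TOY_CORPUS = [
    "the cat drinks milk",
    "the kitten drinks milk",
    "the car needs fuel",
    "the truck needs fuel",
]
vectors = context_vectors(TOY_CORPUS)

for other in ("car", "kitten"):
    print("cat vs", other,
          "| edit distance:", edit_distance("cat", other),
          "| context distance:", round(cosine_distance(vectors["cat"], vectors[other]), 3))
```

By spelling, "cat" is closer to "car" than to "kitten"; by the contexts in this toy corpus, it is the other way around. A real semantic measure would need far more data and far more careful modelling, which is exactly where the difficulty lies.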
Maybe there are other applications which I have never heard of and could only speculate about. For example, one of my friends specialized in computational linguistics and defended his dissertation on inferring individual characteristics of a writer based exclusively on the statistics of the words found in text samples.
I must say that computational linguistics is a developing branch of science which is not yet really close to, say, serious commercial use. I feel that the major breakthrough works lie in the future, which might look attractive to newcomers. I believe every serious work in this field is on the cutting edge of both linguistics and applied mathematics (and maybe even "fundamental" mathematics). If you want to go for it (and it's good that you asked this question), you need to have a solid mathematical background and to go seriously into linguistics, which is really hard to do. I don't think that being "just a programmer", a part of the technical staff, can be practical or reasonable. Getting into real science is the only thing which makes sense, but this is not for everyone, so try to be realistic. I hope you won't take my words as discouragement. I would be more than happy to know that you took this route and were successful.
—SA