Linguistics
Ricardo Lezama  

Linguistics In The Enterprise

Why Linguistics (And Linguists) Are Always On The Back Foot In An Enterprise Context

Practitioners of the Natural Sciences often question Linguistics, in both informal and professional settings. Perhaps this is because the phenomena relevant to the Natural Sciences are more readily observable through instruments. The irony is that everyone can perceive language but, unfortunately, this does not make them experts in the matter.

For instance, a casual observer of a language (an individual who has casually become acquainted with some morsel of data about it) may define Linguistics as just ‘adding an -s suffix to the end of a word to make it plural’. That is not the discipline at all.

Better Direction For Linguists In Industry

The scientific study of language tries to uncover sound generalizations about natural language. Over time, different technical devices (formalisms) have been developed to describe language. Some formalisms, like Lexical Functional Grammar, have been useful and easy to transfer into software technologies.

However, even with the aforementioned successes, the priorities for linguists in an enterprise are often misaligned. Annotation for sentiment, for instance, may be better handled by psychology experts. While archiving data is useful, that task is best left to librarians with a strong intuition about how language behaves in the real world vis-à-vis an Information Retrieval system. To act on this division of labor, linguists need to spell out their formidable intuitions in code. Linguists need to manipulate and train the language models, not create annotations for anything non-linguistic.

Trivia In The Office – Bad Perception

Figuring out when some rule from Oxford applies in a romance novel is fine trivia, but it is not ‘Linguistics’. To begin with, the scientific practice around explaining language behavior is very broad and interdisciplinary. We should not permit the discipline to be reduced to the literal description of inscriptions.

When Language Descriptions Meet Computation

While data and rules about languages are important, memorizing data feels like a somewhat pointless exercise. This is a sign that the field, in some corners, is overly defined by linguistic trivia about English (or some other pet language) rather than by reproducible general principles that can be easily computed. Such principles are what really drive progress in Computational Linguistics, and they can be mathematical or statistical in nature. For instance, spellcheck modules depend on anticipating the most likely candidate word given the neighboring n-grams.
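
To make the spellcheck example concrete, here is a minimal sketch of how neighboring n-grams can be used to pick between candidate corrections. It is not taken from any particular spellcheck product: the toy corpus, the names `bigram_prob` and `rank_candidates`, and the choice of bigrams with add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter

# Toy corpus (an assumption for illustration); a real system would
# estimate counts from a much larger text collection.
CORPUS = (
    "the cat sat on the mat . "
    "the cat ate the fish . "
    "the car is on the road . "
    "a cat sat on a chair ."
).split()

UNIGRAMS = Counter(CORPUS)
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB_SIZE = len(UNIGRAMS)


def bigram_prob(prev_word: str, word: str) -> float:
    """Estimate P(word | prev_word) with add-one (Laplace) smoothing."""
    return (BIGRAMS[(prev_word, word)] + 1) / (UNIGRAMS[prev_word] + VOCAB_SIZE)


def rank_candidates(prev_word: str, candidates: list[str], next_word: str) -> list[str]:
    """Order candidate corrections by how well each fits between its neighbors."""
    def score(candidate: str) -> float:
        # Product of the two bigram probabilities that involve the candidate.
        return bigram_prob(prev_word, candidate) * bigram_prob(candidate, next_word)

    return sorted(candidates, key=score, reverse=True)


# The misspelling "cta" in "the cta sat" could plausibly be "cat" or "car";
# the neighboring words decide which candidate is ranked first.
print(rank_candidates("the", ["car", "cat"], "sat"))
```

With this toy corpus, "cat" outranks "car" between "the" and "sat", while "car" would win between "the" and "is". The general principle, not a memorized list of English facts, is what does the work: swap in a corpus from another language and the same code applies.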

Linguistics Takes Time

Currently, my fear is that as Linguistics in the enterprise context gains strength, the finer points of rationale will be overtaken by boring data recitals. If so, we are in for a world of trivia rather than developments, largely as a result of the influence of non-linguist priorities on the discipline. The drive to subvert some computational tools for linguistic ends does not exist.

Narrative is very important. Understanding why a linguistic analysis exists helps ground activities. Yet a proper narrative for linguistic activities in the enterprise setting, one that would let linguists contribute to NLP activities, has not surfaced beyond the ‘we need better data for this hungry machine learning algorithm’ refrain. That refrain pays the bills, but it does not advance the field.