language model

A language model is a probability distribution over sequences of words. Given any sequence of words of length $m$, a language model assigns a probability $P(w_1, \ldots, w_m)$ to the whole sequence. Language models learn these probabilities by training on text corpora in one or more languages. Given that languages can be used to express an infinite variety of valid sentences (the property of digital infinity), language modeling faces the problem of assigning non-zero probabilities to linguistically valid sequences that may never be encountered in the training data. Several modeling approaches have been designed to surmount this problem, such as applying the Markov assumption or using neural architectures such as recurrent neural networks or transformers.
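As a rough illustration of the Markov assumption, the sketch below approximates the probability of a sequence as a product of bigram probabilities P(w_i | w_{i-1}) and applies add-k smoothing so that sequences never seen in training still receive a non-zero probability. The smoothing method, the toy corpus, the boundary markers `<s>` and `</s>`, and all function names are assumptions made for this example; it also assumes a closed vocabulary, so every scored word must have occurred in training.

```python
from collections import Counter


def train_bigram(corpus):
    """Count contexts and bigrams over tokenized sentences.

    Each sentence is wrapped in hypothetical boundary markers <s> and </s>.
    """
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        contexts.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # (previous word, word) pairs
    # Words that can be predicted: all training words plus the end marker.
    vocab = {w for s in corpus for w in s} | {"</s>"}
    return contexts, bigrams, vocab


def sequence_probability(sentence, contexts, bigrams, vocab, k=1.0):
    """Approximate P(w_1, ..., w_m) as a product of smoothed P(w_i | w_{i-1})."""
    tokens = ["<s>"] + sentence + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-k smoothing: unseen bigrams get a small non-zero share.
        p *= (bigrams[(prev, word)] + k) / (contexts[prev] + k * len(vocab))
    return p


# Toy corpus, purely illustrative.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
contexts, bigrams, vocab = train_bigram(corpus)

p_seen = sequence_probability(["the", "cat", "sat"], contexts, bigrams, vocab)
p_unseen = sequence_probability(["the", "sat", "cat"], contexts, bigrams, vocab)
print(p_seen, p_unseen)
```

The word order that never occurs in the toy corpus gets a lower but still positive probability than the sentence that was observed, which is the behaviour the paragraph above describes. Neural models address the same problem differently, by generalizing from learned representations instead of smoothed counts.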
Language models are useful for a variety of problems in computational linguistics, from initial applications in speech recognition, where they ensure that nonsensical (i.e. low-probability) word sequences are not predicted, to wider use in machine translation (e.g. scoring candidate translations), natural language generation (generating more human-like text), part-of-speech tagging, parsing, optical character recognition, handwriting recognition, grammar induction, information retrieval, and other applications.
Language models are used in information retrieval in the query likelihood model. There, a separate language model is associated with each document in a collection. Documents are ranked based on the probability of the query $Q$ in the document's language model $M_d$: $P(Q \mid M_d)$. Commonly, the unigram language model is used for this purpose.
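A minimal sketch of query likelihood ranking with unigram models follows. Under a unigram model the query probability factorizes into a product over query terms, so each document is scored by summing log term probabilities. The interpolation with a collection-wide model (Jelinek-Mercer smoothing, weight `lam`) is an assumption added here so that a document missing one query term does not score zero; the text above does not specify a smoothing method, and the function name and toy documents are hypothetical.

```python
import math
from collections import Counter


def rank_by_query_likelihood(query, docs, lam=0.5):
    """Rank documents by log P(Q | M_d) under smoothed unigram models.

    query: list of query terms.
    docs:  dict mapping a document name to its list of tokens.
    lam:   weight of the document model versus the collection model
           (an assumption of this sketch, not taken from the text).
    """
    doc_counts = {name: Counter(tokens) for name, tokens in docs.items()}
    collection = Counter()
    for counts in doc_counts.values():
        collection.update(counts)
    collection_len = sum(collection.values())

    scores = {}
    for name, counts in doc_counts.items():
        doc_len = sum(counts.values())
        log_p = 0.0
        for term in query:
            p_doc = counts[term] / doc_len if doc_len else 0.0
            p_coll = collection[term] / collection_len if collection_len else 0.0
            p = lam * p_doc + (1.0 - lam) * p_coll
            if p == 0.0:
                # Term appears nowhere in the collection.
                log_p = float("-inf")
                break
            log_p += math.log(p)
        scores[name] = log_p
    # Highest log-likelihood first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy collection, purely illustrative.
docs = {
    "d1": "language models assign probabilities to word sequences".split(),
    "d2": "speech recognition uses acoustic models".split(),
}
ranking = rank_by_query_likelihood(["language", "models"], docs)
print(ranking)
```

Working in log space avoids numerical underflow for long queries; the ranking is unchanged because the logarithm is monotonic.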
Since 2018, large language models (LLMs) consisting of deep neural networks with billions of trainable parameters, trained on massive datasets of unlabelled text, have demonstrated impressive results on a wide variety of natural language processing tasks. This development has led to a shift in research focus toward the use of general-purpose LLMs.

