Natural Language Processing for Information Retrieval
Information retrieval and access
Hidden Markov Models
Markov models were introduced by A. Markov, who in 1913 used them to model letter sequences in Russian text; today they are used as a general-purpose statistical tool. Tagging is treated as a parameterizable doubly stochastic process (the parameters can be estimated precisely during training) in which the language model is represented by a probabilistic finite automaton.
Two types of models:
Visible models:
- Each state has a single associated observable process.
- The output of the state is not random.
Hidden models:
- The output of the state is random.
- In each state there are several types of observations with different probabilities.
- Doubly stochastic model:
a) transitions between states
b) associated observations
- One of the processes is not directly observable.
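The two random processes can be illustrated by generating data from a small hidden model. The following is only a minimal sketch: the state names `A` and `B`, the observation types `x` and `y`, and all probability values are hypothetical and chosen for illustration. Each step first draws an observation from the current state (the visible process) and then draws the next state (the hidden process).

```python
import random

# Hypothetical toy model: two states, two observation types.
# All probabilities below are illustrative assumptions.
states = ["A", "B"]
symbols = ["x", "y"]
initial = {"A": 0.5, "B": 0.5}
transitions = {"A": {"A": 0.7, "B": 0.3},
               "B": {"A": 0.4, "B": 0.6}}
emissions = {"A": {"x": 0.9, "y": 0.1},
             "B": {"x": 0.2, "y": 0.8}}

def draw(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def sample(length, rng):
    """Generate a hidden state sequence and the observations it emits."""
    hidden, observed = [], []
    state = draw(initial, rng)
    for _ in range(length):
        hidden.append(state)
        observed.append(draw(emissions[state], rng))  # random emission
        state = draw(transitions[state], rng)         # random transition
    return hidden, observed

rng = random.Random(0)
hidden, observed = sample(5, rng)
print("observed:", observed)  # what an outside observer sees
print("hidden:  ", hidden)    # what stays hidden in practice
```

In a real setting only the `observed` list would be available; the `hidden` list is printed here just to make the second process visible.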
Example:
We have a series of urns containing balls of different colors. We do not know how many balls of each color there are in each urn.
Urn 1: P(color 1) = b11, ..., P(color M) = b1M
...
Urn N: P(color 1) = bN1, ..., P(color M) = bNM
Urns = States
Colors = Observations
We want to find the most probable sequence of urns given a sequence of colors.
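One standard way to find the most probable state sequence is the Viterbi algorithm; the source does not name a method, so its use here is an assumption. The sketch below applies it to a hypothetical two-urn, two-color model. The urn names, color names, and all probabilities are invented for illustration.

```python
def viterbi(observations, states, initial, transitions, emissions):
    """Return the most probable state path and its joint probability."""
    # best[t][s]: probability of the best path that ends in state s at step t
    best = [{s: initial[s] * emissions[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            p, arg = max((prev[r] * transitions[r][s], r) for r in states)
            scores[s] = p * emissions[s][obs]
            pointers[s] = arg
        best.append(scores)
        back.append(pointers)
    # Pick the best final state, then follow the back-pointers.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

# Hypothetical model: urns are states, colors are observations.
urns = ["urn1", "urn2"]
initial = {"urn1": 0.5, "urn2": 0.5}
transitions = {"urn1": {"urn1": 0.7, "urn2": 0.3},
               "urn2": {"urn1": 0.4, "urn2": 0.6}}
emissions = {"urn1": {"red": 0.9, "blue": 0.1},
             "urn2": {"red": 0.2, "blue": 0.8}}

path, prob = viterbi(["red", "red", "blue"], urns,
                     initial, transitions, emissions)
print(path)  # ['urn1', 'urn1', 'urn2']
```

With these values the two red balls are best explained by urn1 and the blue ball by urn2, with a joint probability of about 0.068.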
To model tags in NLP:
- States = Tags (urns)
- Observations = Words (colors)
- Observation sequences = Sentences of the text
- Time steps = Positions within the sentence
The same word (color) can occur under different tags (urns), which gives rise to ambiguity. The same color (word) can also appear more than once in each urn (tag), which gives rise to different word emission probabilities for each tag.
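As an illustration of how emission probabilities differ from tag to tag, they can be estimated by relative frequency from a tagged corpus. This is only a sketch under stated assumptions: the tiny corpus, the tag names `DET`, `NOUN` and `VERB`, and the function name `emission_probabilities` are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged corpus as (word, tag) pairs.
# "book" occurs both as a NOUN and as a VERB, so it is ambiguous.
tagged = [("the", "DET"), ("book", "NOUN"), ("book", "VERB"),
          ("a", "DET"), ("flight", "NOUN"), ("book", "NOUN")]

def emission_probabilities(tagged_words):
    """Estimate P(word | tag) as count(word, tag) / count(tag)."""
    tag_counts = Counter(tag for _, tag in tagged_words)
    pair_counts = Counter(tagged_words)
    probs = defaultdict(dict)
    for (word, tag), n in pair_counts.items():
        probs[tag][word] = n / tag_counts[tag]
    return dict(probs)

emissions = emission_probabilities(tagged)
for tag, dist in emissions.items():
    print(tag, dist)
```

Here the word "book" receives a different emission probability under `NOUN` than under `VERB`, which is exactly the ambiguity a tagger has to resolve from context.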
Last updated: 5 April 2007