Original poster: Nicolle

[Case Study] Hidden Markov Model Using Apache Mahout [Share]


Nicolle (student verified) posted on 2015-6-22 01:44:50

Hidden Markov Model Using Apache Mahout


Apache Mahout provides an implementation of the Hidden Markov Model in the org.apache.mahout.classifier.sequencelearning.hmm package.

The overall implementation is provided by eight different classes:

  • HMMModel: This is the main class that defines the Hidden Markov Model.
  • HmmTrainer: This class contains the algorithms used to train the Hidden Markov Model: supervised learning, unsupervised learning, and unsupervised Baum-Welch.
  • HmmEvaluator: This class provides methods to evaluate an HMM. The following use cases are covered:
    • Generating a sequence of output states from a model (prediction)
    • Computing the likelihood that a given model generated a given sequence of output states (model likelihood)
    • Computing the most likely hidden-state sequence for a given model and a given observed sequence (decoding)
  • HmmAlgorithms: This class contains implementations of the three major HMM algorithms: forward, backward, and Viterbi.
  • HmmUtils: This is a utility class that provides methods for handling HMM model objects.
  • RandomSequenceGenerator: This is a command-line tool that generates an output sequence from a given HMM.
  • BaumWelchTrainer: This is a command-line tool for training an HMM from the console.
  • ViterbiEvaluator: This is a command-line tool for Viterbi evaluation.
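The model-likelihood use case handled by HmmEvaluator rests on the forward algorithm from HmmAlgorithms. The following is a minimal pure-Python sketch of that recursion; the two-state toy model is hypothetical and this is only an illustration of the computation, not Mahout's Java code:

```python
# Pure-Python sketch of the forward algorithm, the computation behind
# HmmEvaluator's "model likelihood" use case.  The toy two-state model
# below is hypothetical -- an illustration, not Mahout's implementation.

def forward_likelihood(initial, transition, emission, observed):
    """Return P(observed sequence | model) via the forward recursion."""
    n = len(initial)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [initial[i] * emission[i][observed[0]] for i in range(n)]
    for obs in observed[1:]:
        alpha = [sum(alpha[i] * transition[i][j] for i in range(n))
                 * emission[j][obs] for j in range(n)]
    return sum(alpha)

# Hypothetical 2-state, 2-symbol model
p = forward_likelihood(initial=[0.6, 0.4],
                       transition=[[0.7, 0.3], [0.4, 0.6]],
                       emission=[[0.9, 0.1], [0.2, 0.8]],
                       observed=[0, 1, 0])
print(p)  # ≈ 0.10893
```

Summing the final alpha vector over all hidden states gives the total probability of the observed sequence under the model.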

Now, let's work with Bob's example.

The following is the initial probability vector:

Ice cream    Cake    Juice
0.36         0.51    0.13


The following will be the state transition matrix:

             Ice cream    Cake     Juice
Ice cream    0.365        0.500    0.135
Cake         0.250        0.125    0.625
Juice        0.365        0.265    0.370


The following will be the emission matrix:

             Spicy food    Normal food    No food
Ice cream    0.1           0.2            0.7
Cake         0.5           0.25           0.25
Juice        0.80          0.10           0.10
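For reference, the three tables above can be encoded as plain Python structures and sanity-checked. The index ordering (0 = ice cream / spicy food, and so on) is an assumption for illustration, chosen to match the state numbering used in the commands later:

```python
# Bob's HMM parameters from the tables above, as plain Python structures.
# The 0/1/2 index ordering is an assumption made for this illustration.

STATES = ["ice cream", "cake", "juice"]                   # hidden states
OBSERVATIONS = ["spicy food", "normal food", "no food"]   # observable states

# Initial probability vector
initial = [0.36, 0.51, 0.13]

# State transition matrix: transition[i][j] = P(next = j | current = i)
transition = [
    [0.365, 0.500, 0.135],  # from ice cream
    [0.250, 0.125, 0.625],  # from cake
    [0.365, 0.265, 0.370],  # from juice
]

# Emission matrix: emission[i][k] = P(observe symbol k | hidden state i)
emission = [
    [0.1, 0.2, 0.7],     # ice cream
    [0.5, 0.25, 0.25],   # cake
    [0.80, 0.10, 0.10],  # juice
]

# Sanity check: every probability row must sum to 1
for row in [initial] + transition + emission:
    assert abs(sum(row) - 1.0) < 1e-9
```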


Now we will execute a command-line-based example of this problem. We have three hidden states for what Bob has eaten as a snack: ice cream, cake, or juice. We also have three observable states for what he is having at lunch: spicy food, normal food, or no food at all. The following are the steps to execute from the command line:

  • Create a directory named hmm: mkdir /tmp/hmm. Go to this directory and create the sample input file of observed states. It holds a sequence of Bob's lunch habits: spicy food (state 0), normal food (state 1), and no food (state 2). Execute the following command:

    echo "0 1 2 2 2 1 1 0 0 2 1 2 1 1 1 1 2 2 2 0 0 0 0 0 0 2 2 2 0 0 0 0 0 0 1 1 1 1 2 2 2 2 2 0 2 1 2 0 2 1 2 1 1 0 0 0 1 0 1 0 2 1 2 1 2 1 2 1 1 0 0 2 2 0 2 1 1 0" > hmm-input
  • Run the Baum-Welch algorithm to train the model using the following command:

    mahout baumwelch -i /tmp/hmm/hmm-input -o /tmp/hmm/hmm-model -nh 3 -no 3 -e .0001 -m 1000
The parameters used in the preceding command are as follows:
  • i: This is the input file location
  • o: This is the output location for the model
  • nh: This is the number of hidden states. In our example, it is three (ice cream, juice, or cake)
  • no: This is the number of observable states. In our example, it is three (spicy, normal, or no food)
  • e: This is the epsilon value, the convergence threshold
  • m: This is the maximum number of iterations
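What the -e and -m flags control can be sketched as a generic convergence loop: training stops once the parameter change falls below epsilon, or when the iteration cap is reached. The "training step" below is a hypothetical stand-in, not Mahout's actual Baum-Welch update:

```python
# Sketch of the stopping rule controlled by -e (epsilon) and -m (max
# iterations).  The step function is a hypothetical stand-in for one
# Baum-Welch re-estimation pass.

def train_until_converged(step, start, epsilon=1e-4, max_iters=1000):
    """Iterate `step` until the change is below epsilon or the cap is hit."""
    value, iters = start, 0
    while iters < max_iters:
        new_value = step(value)
        iters += 1
        if abs(new_value - value) < epsilon:
            value = new_value
            break  # converged: change dropped below epsilon
        value = new_value
    return value, iters

# Stand-in update that halves the distance to 1.0 on each iteration
value, iters = train_until_converged(lambda v: (v + 1.0) / 2.0, 0.0)
print(iters)  # converges well before the 1000-iteration cap
```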

The following screenshot shows the output on executing the previous command:


  • Run the model to predict the next 10 states of the observable sequence using the following command:

    mahout hmmpredict -m /tmp/hmm/hmm-model -o /tmp/hmm/hmm-predictions -l 10

The parameters used in the preceding command are as follows:


  • m: This is the path for the HMM model
  • o: This is the output directory path
  • l: This is the length of the generated sequence

  • View the prediction for the next 10 observable states:

    cat /tmp/hmm/hmm-predictions

The output of the previous command is shown in the following screenshot:


From the output, we can say that the next observable states for Bob's lunch will be spicy, spicy, spicy, normal, normal, no food, no food, no food, no food, and no food.
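When reading the prediction file, the numeric states map back to the lunch labels defined when the input was created (0 = spicy, 1 = normal, 2 = no food). A small hypothetical helper makes the mapping explicit:

```python
# Hypothetical helper: translate the numeric observable states emitted
# by `mahout hmmpredict` back into lunch labels, using the numbering
# defined when the input file was created.

LUNCH_LABELS = {0: "spicy", 1: "normal", 2: "no food"}

def decode_predictions(line):
    """Map a space-separated state string to readable lunch labels."""
    return [LUNCH_LABELS[int(tok)] for tok in line.split()]

# The 10-state prediction described above
print(decode_predictions("0 0 0 1 1 2 2 2 2 2"))
```

Running this on the predicted sequence reproduces the reading given above: spicy three times, normal twice, then no food five times.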

  • Predict the hidden states for a given observation sequence using the Viterbi algorithm. First create the input file:

    echo "0 1 2 0 2 1 1 0 0 1 1 2" > /tmp/hmm/hmm-viterbi-input
  • Generate the output along with the likelihood of generating this sequence:

    mahout viterbi --input /tmp/hmm/hmm-viterbi-input --output /tmp/hmm/hmm-viterbi-output --model /tmp/hmm/hmm-model --likelihood

The parameters used in the preceding command are as follows:

  • input: This is the input location of the file
  • output: This is the output location of the Viterbi algorithm's output
  • model: This is the HMM model location that we created earlier
  • likelihood: This flag also outputs the computed likelihood of the observed sequence

The following screenshot shows the output on executing the previous command:


  • View the Viterbi predictions saved in the output file using the cat command:

    cat /tmp/hmm/hmm-viterbi-output
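The decoding that `mahout viterbi` performs can be sketched in pure Python on Bob's model from the tables above. This is an illustration of the Viterbi algorithm, not Mahout's actual implementation, and the short observation sequence is chosen just for the example:

```python
# Pure-Python sketch of Viterbi decoding on Bob's model from the tables
# above -- an illustration of the algorithm, not Mahout's Java code.

def viterbi(initial, transition, emission, observed):
    """Return the most likely hidden-state sequence for `observed`."""
    n = len(initial)
    # delta[i]: probability of the best path so far ending in state i
    delta = [initial[i] * emission[i][observed[0]] for i in range(n)]
    paths = [[i] for i in range(n)]
    for obs in observed[1:]:
        new_delta, new_paths = [], []
        for j in range(n):
            # pick the predecessor that maximizes the path probability
            best_i = max(range(n), key=lambda i: delta[i] * transition[i][j])
            new_delta.append(delta[best_i] * transition[best_i][j]
                             * emission[j][obs])
            new_paths.append(paths[best_i] + [j])
        delta, paths = new_delta, new_paths
    best = max(range(n), key=lambda i: delta[i])
    return paths[best]

# Bob's model (0 = ice cream, 1 = cake, 2 = juice)
initial = [0.36, 0.51, 0.13]
transition = [[0.365, 0.500, 0.135],
              [0.250, 0.125, 0.625],
              [0.365, 0.265, 0.370]]
emission = [[0.1, 0.2, 0.7],
            [0.5, 0.25, 0.25],
            [0.80, 0.10, 0.10]]

# Decode a short lunch sequence (0 = spicy, 1 = normal, 2 = no food)
print(viterbi(initial, transition, emission, [0, 1, 2]))
# → [1, 2, 0]: cake, then juice, then ice cream
```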


Reference

Learning Apache Mahout Classification by Ashish Gupta


Edwardu replied on 2015-6-22 02:31:19:
good book