I have written a program to identify the poet who wrote a given poem. To be more specific, I have a training set of 30,000 lines of poetry from 3 poets; for each poet there are 10,000 lines (verses) in the training set. I trained the model using a backoff model:

Backoff formula: P̂(w_i | w_{i−1}) = λ3·P(w_i | w_{i−1}) + λ2·P(w_i) + λ1·ε

That is, for each poet, I first count how many times each word appears in his poems in the training set, and then for each pair of words I also compute the frequency: given that word a has appeared in the text, what is the probability that word b comes after it? This is known as a bigram. Finally, I combine these probabilities to compute P̂ for each verse and assign the test verse to the poet with the highest probability among the three. The problem is that when I started tuning the hyperparameters, I found that when the lambdas are kept steady and only epsilon is changed, the accuracy varies. But it should not, because with the lambdas unchanged, changing epsilon from ε to ε′ only shifts every per-word estimate by the same constant λ1·(ε′ − ε), so all the P̂ values move together and accuracy should not be affected. I assume I have implemented the code correctly, because I have already reached 85% accuracy, which is good for this kind of model. This is the result of comparing three models, keeping the lambdas constant but changing epsilon:

[Figure: accuracy of the three models with the lambdas held constant and epsilon varied]

question from: https://stackoverflow.com/questions/66067970/accuracy-decreases-when-i-increase-epsilon-in-backoff-nlp-model


1 Answer

Waiting for answers
