def test(classifier, test_set):
    hits = 0
    for feats, label in test_set:
        if label == classify(classifier, feats):
            hits += 1
    return hits / len(test_set)
def get_features(sample): return (
    'll: %s' % sample[-1],   # get last letter
    'fl: %s' % sample[0],    # get first letter
    'sl: %s' % sample[1],    # get second letter
)
if __name__ == '__main__':
    samples = (line.split() for line in open('names.txt', encoding='utf-8'))
    features = [(get_features(feat), label) for feat, label in samples]
    train_set, test_set = features[:-100], features[-100:]
    classifier = train(train_set)
    print('Accuracy:', test(classifier, test_set))
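The code above calls train() and classify(), which are defined elsewhere in the article. Purely as a sketch that matches those call signatures (the defaultdict counters and the 10**(-7) smoothing constant for unseen feature/class pairs are assumptions here, not necessarily the author's exact code), a minimal Naive Bayes pair could look like this:

from collections import defaultdict
from math import log

def train(samples):
    classes, freq = defaultdict(lambda: 0), defaultdict(lambda: 0)
    for feats, label in samples:
        classes[label] += 1                  # count class frequencies
        for feat in feats:
            freq[label, feat] += 1           # count feature frequencies per class
    for label, feat in freq:                 # estimate P(feature | class)
        freq[label, feat] /= classes[label]
    for c in classes:                        # estimate P(class)
        classes[c] /= len(samples)
    return classes, freq

def classify(classifier, feats):
    classes, prob = classifier
    # pick the class minimizing -log P(class) - sum(log P(feature | class)),
    # substituting a tiny constant (an assumption) for unseen pairs
    return min(classes.keys(),
               key=lambda cl: -log(classes[cl])
               + sum(-log(prob.get((cl, feat), 10 ** (-7))) for feat in feats))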
from sklearn.tree import DecisionTreeClassifier

# ...

def get_features(sample): return (
    ord(sample[-1]),   # get last letter
    ord(sample[0]),    # get first letter
    ord(sample[1]),    # get second letter
)

# ...

clf = DecisionTreeClassifier()
clf.fit(
    [i[0] for i in train_set],
    [i[1] for i in train_set]
)
hits = 0
for feats, label in test_set:
    if label == clf.predict([feats])[0]:
        hits += 1
print('Accuracy DecisionTreeClassifier:', hits / len(test_set))
A Naive Bayes classifier in 25 lines of code