Practical Text Mining with Perl (Wiley Series on Methods and Applications in Data Mining)
Contents:
The Adventures of Tom Sawyer, Adventures of Huckleberry Finn, The Deerslayer, The Last of the Mohicans, The Headless Horseman, Life Among the Indians, Osceola the Seminole
Description:
Provides readers with the methods, algorithms, and means to perform text mining tasks. This book is devoted to the fundamentals of text mining using Perl, an open-source programming tool that is freely available via the Internet (www.perl.org). It covers mining ideas from several perspectives (statistics, data mining, linguistics, and information retrieval) and provides readers with the means to successfully complete text mining tasks on their own. The book begins with an introduction to regular expressions, a text pattern methodology, and quantitative text summaries, all of which are fundamental tools for analyzing text. Then, it builds upon this foundation to explore: probability and texts, including the bag-of-words model; information retrieval techniques such as the TF-IDF similarity measure; concordance lines and corpus linguistics; multivariate techniques such as correlation, principal components analysis, and clustering; Perl modules,...