Taming Text: How to Find, Organize, and Manipulate It

Grant S. Ingersoll
322
October 06, 2014
★★★ (-21.05%)

The book alternates between a great overview of the subject and getting down-and-dirty with the code. I don’t know a good solution to this, but I was less interested in the code than in the high-level view.

It’s a good discussion of how to manage text: how to tokenize it, search it, cluster it, and classify it. In the later chapters, it bogs way, way down – Chapter 7, on classification, is probably a quarter of the entire book in length. In many cases the authors “dropped to code” immediately without discussing the theory much.

In the end, I got what I came for – I understand the basic concepts of text processing. I now know what I don’t know, and that’s often half the battle.
