So what’s going on while John searches for more monster mayhem? mBG treats each book on the shelf as a positively labeled instance of the concept “Monsters Not Lovers.” To find books to recommend, mBG selects a book from the shelf at random and then queries its database for a list of the most similar books whose class is unknown; that is, books for which we don’t know whether they belong on the shelf. (Similarity is based on subject, genre, and page length.)
We could stop there and recommend a book from that list to John. But the odds that it is a good choice are low, so we check the candidate using an algorithm called “k-nearest neighbor.” This algorithm finds the k books that are most similar to the book we’re trying to categorize. Then it counts how many of those k books are on the shelf, how many aren’t, and how many are unknowns. If the majority belong on the shelf, that’s a pretty good indication that the book mBG chose does too, and so the system recommends it to John.
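To make the vote concrete, here is a minimal Python sketch of that check. It is not mBG’s actual code: the `Book` class, the `similarity` formula, the label strings, and the sample titles are all made up for illustration, and “majority” is assumed to mean more than half of the k neighbors.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    subject: str
    genre: str
    pages: int


def similarity(a, b):
    # Hypothetical measure: one point each for a matching subject and genre,
    # plus up to one point for a similar page length.
    score = float(a.subject == b.subject) + float(a.genre == b.genre)
    score += 1.0 - min(abs(a.pages - b.pages), 500) / 500
    return score


def knn_vote(candidate, library, labels, k=3):
    # Find the k books most similar to the candidate.
    others = [b for b in library if b != candidate]
    neighbors = sorted(others, key=lambda b: similarity(candidate, b), reverse=True)[:k]
    # Count neighbors on the shelf, off the shelf, and unknown.
    counts = Counter(labels.get(b, "unknown") for b in neighbors)
    # Assumption: recommend only if more than half of the k neighbors are on the shelf.
    return counts["positive"] > k / 2, counts


# Invented sample data.
swamp = Book("Swamp Thing Rises", "monsters", "horror", 300)
kraken = Book("Night of the Kraken", "monsters", "horror", 320)
paris = Book("Hearts in Paris", "romance", "romance", 310)
claws = Book("Claws", "monsters", "horror", 280)
fangs = Book("Fangs", "monsters", "thriller", 290)

library = [swamp, kraken, paris, claws, fangs]
labels = {swamp: "positive", kraken: "positive", paris: "negative"}

recommend, counts = knn_vote(claws, library, labels, k=3)
```

With this toy data, the three nearest neighbors of “Claws” are two shelf books and one unknown, so the sketch would recommend it.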
What if the book doesn’t seem to belong on the shelf? The mBG system continues checking the books most similar to shelf members. If it gets through all of the nearby books and still hasn’t found one to recommend, it falls back to suggesting the book with the highest score, calculated by subtracting the number of negatives from the number of positives among its k nearest neighbors.
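The fallback can be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than mBG’s implementation: the function name, the label strings, and the sample titles are hypothetical, the neighbor labels are taken as already computed, and ties go to whichever candidate was checked first.

```python
def pick_recommendation(neighbor_labels, k):
    """neighbor_labels maps each candidate title to the labels of its k nearest neighbors."""
    best_title, best_score = None, None
    for title, labels in neighbor_labels.items():
        positives = labels.count("positive")
        negatives = labels.count("negative")
        # Normal case: more than half of the neighbors are on the shelf.
        if positives > k / 2:
            return title, "majority"
        # Otherwise remember the candidate with the best positives-minus-negatives score.
        score = positives - negatives
        if best_score is None or score > best_score:
            best_title, best_score = title, score
    # No candidate won a majority, so fall back to the highest score.
    return best_title, "fallback"


# Invented sample data: no candidate has a majority of positive neighbors.
no_majority = {
    "Claws": ["positive", "negative", "unknown"],      # score 0
    "Fangs": ["negative", "negative", "unknown"],      # score -2
    "Tentacles": ["positive", "unknown", "unknown"],   # score 1
}
fallback_choice = pick_recommendation(no_majority, k=3)

# Here the first candidate wins the vote outright.
with_majority = {"Claws": ["positive", "positive", "unknown"]}
majority_choice = pick_recommendation(with_majority, k=3)
```

In the first example nobody wins the vote, so “Tentacles” is suggested on the strength of its score of 1.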