How many people still buy books? And, of far greater interest to me, how many people still buy books on IT and technology?
Precise numbers are notoriously difficult to find. The audience insight company Nielsen (or, in some parts of the world, NPD) has maintained its BookScan data for many years. But you need to pay for access and, even if you do, those numbers cover only a subset of the physical books sold within the US. That leaves e-books and non-US sales unrepresented.
Given that as many as 50% of the world’s books are now sold through Amazon, their Best Sellers Rank (BSR) can be a useful proxy for understanding actual sales. There are even web applications like the Amazon Book Sales Calculator that will take a BSR and estimate how many real sales it represents.
The BSR can be helpful for estimating how well individual books are selling, but there’s no easy way to collect data from across the entire Amazon website. Still, I wanted to get a sense of big-picture trends in the technology book market, so I created a very small data sample that might just be useful.
To do that, I first singled out the five big mainstream publishers in the IT/programming space: Manning, No Starch Press, O’Reilly Media, Packt, and Sybex (Wiley). I then opened the Amazon page devoted to each of those companies and sorted the results by what Amazon calls “Best Sellers”. For the first 10 titles displayed for each company, I collected the BSR, publication date, number of pages, and number of reviews. Technically, since O’Reilly’s page can’t be sorted by Best Sellers, I just took the first eight titles they displayed.
That gave me data on the top sellers for each of what I believe are the five largest mainstream publishers in the technology market. Well, right off the top, it’s clear that not all publishers are equal. As you can see, a book published by No Starch is far more likely to sell at significant rates than one from Packt. Which is bad news for me when you consider how, over the years, I pitched three projects to No Starch and was turned down each time. Most of my Manning and Sybex books have done reasonably well, though.
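For anyone who wants to run the same kind of publisher comparison on their own numbers, here is a minimal sketch. It assumes the data has been reduced to (publisher, BSR) pairs and summarizes each publisher by its median rank. The publisher labels and rank values below are made-up placeholders rather than the figures I collected, and `median_rank_by_publisher` is just a name I picked for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (publisher, BSR) pairs. These are placeholder values for
# illustration only, not the data described in this post.
sample = [
    ("Publisher A", 12_000),
    ("Publisher A", 45_000),
    ("Publisher A", 30_000),
    ("Publisher B", 150_000),
    ("Publisher B", 90_000),
    ("Publisher B", 400_000),
]


def median_rank_by_publisher(rows):
    """Group BSR values by publisher and return each publisher's median rank."""
    ranks = defaultdict(list)
    for publisher, bsr in rows:
        ranks[publisher].append(bsr)
    return {publisher: median(values) for publisher, values in ranks.items()}


summary = median_rank_by_publisher(sample)

# Lower BSR means better sales, so sort ascending to list the strongest first.
for publisher, rank in sorted(summary.items(), key=lambda item: item[1]):
    print(f"{publisher}: median BSR {rank:,}")
```

The median is used here rather than the mean because a single very poorly ranked title can drag an average a long way in a sample of ten books.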
This scatter plot shows that - at least for the titles we’re talking about - sales don’t seem to fall all that much as a book ages. Of course, that might be because books covering fast-evolving technologies are updated more frequently, so older editions won’t show up among a publisher’s best sellers.
Finally, I calculated correlations between books’ age, page count, Amazon reviews, and BSR. Bearing in mind once again that lower BSR numbers indicate higher sales, we can see that there’s a weak - but identifiable - correlation between the number of reviews a book’s Amazon listing has attracted and better sales (-0.19). This will surprise exactly no one. Of course, lots of negative reviews probably won’t have that impact.
You’d expect that the longer a book has been on the market, the more reviews it’s likely to have, although at 0.04 that relationship is negligible in this sample. And there’s a tiny, barely visible correlation (-0.06) between higher page counts and improved sales. But I’d say our dataset is way too small to be confident about that one.
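If you’d like to reproduce this kind of calculation, here is a rough sketch of a Pearson correlation coefficient in plain Python. Treat it as an illustration under assumptions: Pearson is one common choice of coefficient, the column names (`bsr`, `reviews`, `pages`, `age_months`) are hypothetical, and the values are placeholders rather than the numbers behind the figures quoted above.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical columns, one entry per book. Placeholder values only,
# not the data collected for this post.
books = {
    "bsr": [12_000, 45_000, 30_000, 150_000, 90_000, 400_000],
    "reviews": [820, 310, 95, 140, 60, 12],
    "pages": [350, 520, 280, 610, 450, 300],
    "age_months": [48, 30, 12, 60, 24, 6],
}

# A negative value against BSR means the column rises as sales improve,
# because a lower BSR indicates higher sales.
for column in ("reviews", "pages", "age_months"):
    print(f"bsr vs {column}: {pearson(books['bsr'], books[column]):+.2f}")
```

Because BSR is a rank rather than a sales count, a rank-based measure such as Spearman’s coefficient might be a better fit, and with only a few dozen rows any coefficient close to zero should be read as “no clear relationship”.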
It’s not much, I’ll admit. But it does suggest that more and better data could reveal some stronger and more valuable insights. If anyone is able to share larger datasets, I’d be happy to revisit this topic.