Thinking About Data, Biases, & Discussions About Camera Gear

27Apr2017

You’ll have to forgive me; I have a bit of a rant here about the level of discussion on the Internet around cameras. Don’t get me wrong, I like talking about cameras and other camera gear, and even to some extent the apparent strategies of various manufacturers. Heck, I’ve even written a series of posts advancing ideas that I’d like to see implemented by camera makers. In fact, it’s probably safe to say that a good percentage of us photographers like engaging in these kinds of discussions.

At the same time, I find myself continually annoyed by the level of discourse on these kinds of topics. A large part of that, I believe, stems from the near complete lack of good data and information to work with in these kinds of discussions. As a result, it’s very easy to make claims that are both unsupported and subject to significant levels of bias.

It’s not just commenters on various discussion forums who are “guilty”, if you can even call it that, of not being able to discuss things effectively. Many authors of articles published by the photography press have fallen into the same trap of not having enough data to properly contextualize the things they’re talking about, or worse, just going for outright sensationalism like so much of the rest of the media.

Certainly there is data out there on things like the demand for features and actual market share rates at a segment level. However, it’s almost always not publicly available, and in many cases almost certainly proprietary and so will never become publicly available.

NPD, for example, is a market research firm that provides some of this kind of data to various camera companies. They were the source Sony used to claim that it was the #2 best selling full frame camera maker in the US for January and February of 2017. However, whatever data NPD may have, we, the general photographic community, don’t have access to it.

Likewise, companies (Canon seems to be the favorite target right now) are frequently lambasted for not providing their “customers” what their customers want. It’s easy to make this claim; it’s much harder to actually support it.

As a registered Canon camera owner (and separately as a CPS member) I get a couple of surveys a year about how I use my gear and what features I’m most interested in. Sometimes those surveys are entirely useless to me (no, Canon, I’m never going to edit and correct my images on my smartphone). Other times they touch on topics I do care about. Either way, as a result of these surveys, Canon certainly has data on the needs and desires of their users. Again, we, the general photography community, don’t have access to this proprietary information, but Canon, at least, is likely not guessing wildly when they do things.

Since we don’t really have access to the data the manufacturers and market research firms have, in order to bolster our arguments on various discussion forums, we have to resort to less complete data from other sources.

This creates a separate problem: is the limited data representative of the whole?

As the meme goes, “The plural of anecdote is not data,” and there’s a certain truth to it. For a sample to be valid, it has to actually represent the population it’s trying to cover.

Finding a relevant sample population is a challenge in and of itself, and one has to be actively aware of biases and how they can affect data collection to do this.

One example of this going wrong that really sticks out in my mind was a discussion centered around removing the useless video capabilities in VDSLRs. One poster claimed that video was entirely useless and should be removed. To back this claim up, he offered up the anecdotal evidence that he had informally surveyed the members of his photography club and none of them used video capabilities at all.

It shouldn’t take a statistician to point out the obvious: people who use their VDSLRs primarily for video aren’t likely to be members of photography clubs. Even if we take his claim as being true (which I have no problem doing), the sample is not representative of all the people buying VDSLRs. As a result, it can’t be used to speak for the usefulness of the feature for all VDSLR buyers.

This is as good a point as any to segue into biases.

Let me start out by being very clear: biases, as I’m talking about them here, are amoral. They’re not inherently right or wrong; they’re just a product of the way our brains are naturally wired. If we’re cognizant of them, we can work to minimize their impact on our decisions or discussions, but that’s not always necessary.

As one example, I’m inherently biased towards Canon gear. It’s what I own and what I find works best for me (confirmation bias). However, I’m fully aware that I’m inherently biased towards Canon, and so I can, and do, work to ensure that I’m not letting that influence me when I’m writing about another manufacturer’s products.

Going back to my previous example (video features), it’s also a great example of two biases that affect virtually everyone unless they’re aware of them and actively work to exclude them.

The first bias demonstrated here is selection bias. Selection bias is the tendency to select groups that support your position. The resulting sample isn’t representative of the whole because only a specific community was involved. A photography club may be representative of photographers, but photographers clearly aren’t the only people buying VDSLRs.
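
To make the mechanism concrete, here is a minimal sketch in Python. Every number in it is invented purely for illustration (it is not real market data), and the Buyer class and the video_share function are hypothetical names of my own. It builds a pretend population of 1,000 VDSLR buyers, then "surveys" only the photography club members among them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buyer:
    uses_video: bool
    in_photo_club: bool


def build_population():
    # Made-up numbers, chosen only to illustrate the mechanism:
    #   700 stills-only buyers, 100 of whom belong to a photography club
    #   300 video users, none of whom belong to a photography club
    buyers = []
    buyers += [Buyer(uses_video=False, in_photo_club=True)] * 100
    buyers += [Buyer(uses_video=False, in_photo_club=False)] * 600
    buyers += [Buyer(uses_video=True, in_photo_club=False)] * 300
    return buyers


def video_share(buyers):
    # Fraction of the given buyers who use the video features.
    return sum(b.uses_video for b in buyers) / len(buyers)


population = build_population()

# The "survey": only photography club members get asked.
club_sample = [b for b in population if b.in_photo_club]

print(f"Video users among all buyers:   {video_share(population):.0%}")
print(f"Video users in the club sample: {video_share(club_sample):.0%}")
```

With these assumed numbers the club sample reports no video users at all, even though nearly a third of the pretend buyers shoot video. The survey answers were honest; the sample simply never had a chance of including the people in question.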

Selection biases can be obvious and easy to identify, but they aren’t always. Some people taking part in these discussions work for camera stores and have used their stores’ sales to indicate supposedly larger trends. But what kind of business do these stores do and who are their clientele? That right there is a not so obvious level of selection bias.

Of the two actual camera stores near me, at least as best as I can tell, one’s business is dominated heavily by pros and the other’s is dominated largely by wealthy amateurs. Moreover, neither has an effective online presence.

The pro oriented one is going to over-represent pro needs compared to the general population. The wealthy amateur one is going to over-represent (based on having been there) Leica and high end products compared to the population on the whole. And neither of them, even in aggregate, represents the people who prefer to buy online for convenience, price, or not having to potentially interact with pushy salesmen.

The other bias is projection bias. Put simply, this is our natural tendency to believe that others have the same needs and goals as we do. The whole line of discussion, that features are useless because someone says so, is an example of projection bias.

Projection bias is also one of the major areas that having statistically meaningful data actually combats. It becomes considerably harder to make a serious argument that a feature is useless or that needs aren’t being met when data says that people are using that feature.

That actually brings me to the final point I want to talk about with respect to data and discussions. In an effort to frame an argument or create a story, frequently conclusions are drawn from data that the data doesn’t actually support.

One popular topic that continuously comes up is the oft-publicized death of the camera industry. Almost always this is based on annually published sales figures from groups like the CIPA (Camera & Imaging Products Association). Frequently, there will be stories published using this data claiming that various things (smartphones, mirrorless cameras, garden gnomes) are killing the camera industry, and will cite the CIPA sales numbers as their evidence.

Frequently, these discussions will also fail to consider other factors. One big example: many sites in the photo press ran yet another story about the camera industry dying in 2016 because of further decreased sales, but the vast majority of them seem to have forgotten that Sony’s sensor fabrication plant in Japan was heavily damaged by an earthquake that year. Sony manufactures sensors for not only themselves, but pretty much everyone else except Canon. Without a large supply of sensors, production would have been down markedly, and with fewer products to sell, so would sales.

When it comes right down to the problem of data, I don’t have an answer to these questions. There are certainly places we can go to get some information, and these can shed some light on the discussion:

The CIPA publishes camera sales information broken down by month, market, and camera type, in both units and yen. (Data goes back all the way to the 1950s.)
Photosynthesis.co.nz has a good estimate of Nikon lens sales by model based on known serial numbers. Some Nikon camera data can also be found on that site.

(If and when I come across more and better primary grade sources, I’ll endeavor to add them to this list. If you know of any, please leave a comment.)

There are of course other, slightly more limited sources. Flickr activity, for example, can be used to draw some inferences about the popularity of various camera brands. However, Flickr’s data is not without selection bias (it only counts Flickr users, and video isn’t supported, so no video users will be using Flickr).

In any event, now that I’ve written about the problems associated with talking about features and how well manufacturers are serving the market, over the next couple of weeks I’m going to be venturing into this bias-laden, data-deprived area and talking about some things that have recently been bugging me. That said, I wanted to talk about biases and data first, in no small part because I know I’m going to end up using what limited data we do have to try and support my assertions and explanations, and I wanted to be up front about the limitations that I have to work with.
