Spending on IoT Expected to Grow to $520B

According to a new report from consulting firm Bain & Co, big companies plan to more than double their annual spending on smart, Internet-connected devices such as video surveillance cameras and industrial sensors, from $235 billion in 2017 to $520 billion by 2021.

In its previous survey of IoT spending, in 2016, Bain projected $450 billion for 2020, a figure that included purchases of devices, software, and related services. The higher forecast shows that businesses' appetite for connected devices is increasing alongside growing consumer demand for everything “smart” in the home, including cameras, thermostats, speakers, and connected light bulbs.

Connected video surveillance cameras, and sensors that measure throughput or alert when parts are wearing out, usually send their information to the cloud for analysis. But newer products are moving the smarts to the edge, meaning that the products themselves will have more powerful compute engines and built-in AI algorithms. That makes them more independent and more cost-effective to operate, which in turn boosts sales.

Not surprisingly, Bain found that not all connected devices have caught on as much as previously expected, primarily due to security concerns, complex integration paths, and uncertain ROI. Many companies hoped that collecting extensive data about their equipment would help with predictive maintenance, reducing costs and streamlining operations. That has been harder to prove out.

In one example, Bain pointed to elevator manufacturer Schindler working with GE to collect sensor data from 60,000 elevators. But a lack of historical data and problems integrating different data formats made predicting maintenance needs difficult, which has caused interest in predictive maintenance use cases to wane.

Interest in remote monitoring, on the other hand, has risen because it tends to be a standalone application with clear customer benefits.  

As Bain notes, “the next few years will be critical to the development of IoT markets as leaders continue to make gains and expand their industry-specific offers. Incumbents that fail to move quickly enough to address customers’ needs are likely to get leapfrogged by more nimble competitors. Device makers, in particular, run the risk of seeing software and analytics competitors capture the value of solutions, leaving them to deliver lower-profitability hardware components.”

Removing barriers to adoption is critical – utilizing proven technology and service providers that understand customer pain points and have experience delivering end-to-end, secure, scalable systems is imperative. 

The Possibilities of Ubiquitous Video Streams

The idea of video cameras everywhere used to conjure up thoughts of police states or 1984. Today, however, each of us has at least two cameras at the ready: one on our phone and likely another, such as a video surveillance camera inside our home or overlooking our doorstep. Cameras are everywhere, and the tech giants have been investing huge sums into making this technology cheap, accessible, and ubiquitous.

In March, Amazon announced it was acquiring Ring, the video doorbell company. Several years earlier, Google acquired Nest, which then acquired Dropcam and brought it into the fold. The two deals represent billions of dollars invested in developing both the hardware and the infrastructure to support large-scale video recording and analysis. Over the same period, dozens of alternative products have come to market, such as the WeMo NetCam, Netgear Arlo, and Canary. The cameras on other devices, such as the Echo Show and the JIBO, can also double as cameras for the home.

With so much video data being streamed, the question is what becomes possible when you combine multiple streams with some of the latest AI technologies. What can consumers expect of these devices over the next 2-3 years, and what considerations, especially around privacy, will we need to resolve?

Large advances in hardware, coupled with new means of processing video, have driven costs down sharply over the years while the capabilities of these devices have grown just as quickly. As the bandwidth, latency, and congestion issues of wireless networks are addressed, 4K, 60 FPS video can be streamed without concern about grainy images or buffering.

Behind the scenes, computer vision technology has become commoditized, with more service providers offering it and more functionality being exposed to developers. New edge computing technology may deliver the benefits of computer vision AI with the security of local processing.

What can you do today?

In the home, the primary placements of cameras likely include:

  • Baby cams to check on infants and toddlers
  • Doorbell cameras that face out onto a front porch
  • Outdoor cameras looking at backyards
  • Indoor cameras looking at entry ways

Most of the cameras on the market today can stream video to a phone or desktop, back up the video online, take time-lapse images, and push voice out through the camera. Some also send alerts through app notifications, email, or text message for event triggers such as motion or the identification of a person.

With these features alone, you can already do a lot:

  • Know if a package has been delivered
  • See if anyone is home or has come / gone
  • Check if the surrounding area is safe
  • Get a sense of the environment remotely (e.g. is there light inside yet)
  • Provide voice communication to someone in the area
  • Check if an infant child is sleeping / safe

However, when you add more cameras and combine them with AI, you can extract a lot more information about the environment.

Google, Amazon, Microsoft, and IBM, among others, now offer computer vision APIs with extensive functionality that even a novice developer can implement. These include the ability to:

  • Identify an object
  • Identify a person
  • Understand logos
  • Extract text
  • Determine “inappropriate content”
  • Transcribe the video
  • Identify handwriting
  • Identify smiles
  • Identify emotion
  • Estimate people’s ages
  • Identify gestures
  • Identify foods

There is a lot of overlap among the service providers. While these services are still too expensive for continuous use today (they cost pennies per minute), the price will likely drop to pennies per hour or per day over the next few years. Even with only these capabilities, it’s already possible to extend the applications currently available on today’s webcams, for example:

  • Tracking a user from room to room
  • Logging when someone arrives home or leaves
  • Tracking the emotion of different people in frame throughout the day
  • Keeping a record of what we’re talking about
  • Tracking visitors to the home
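
As a rough illustration of one item above, logging when someone arrives home or leaves, here is a minimal Python sketch. It assumes that some vision service has already labeled each sampled frame with whether a person is visible. The `Detection` record, the `presence_events` function, and the 30-second `min_gap` threshold are hypothetical names and values chosen for this example, not part of any vendor's API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Detection:
    """One sampled frame (hypothetical record, not a real API type)."""
    timestamp: float       # seconds since the start of the recording
    person_present: bool   # assumed output of a person-detection service


def presence_events(
    detections: Iterable[Detection], min_gap: float = 30.0
) -> List[Tuple[float, str]]:
    """Collapse per-frame detections into 'arrived' / 'left' events.

    A person is logged as having left only after `min_gap` seconds with no
    positive detection, so brief misses do not create spurious events.
    A visit still in progress when the data ends produces no 'left' event.
    """
    events: List[Tuple[float, str]] = []
    present = False
    last_seen = 0.0
    for d in detections:
        if d.person_present:
            if not present:
                events.append((d.timestamp, "arrived"))
                present = True
            last_seen = d.timestamp
        elif present and d.timestamp - last_seen >= min_gap:
            # Record the departure at the last moment the person was seen.
            events.append((last_seen, "left"))
            present = False
    return events


if __name__ == "__main__":
    frames = [
        Detection(0, False),
        Detection(10, True),
        Detection(20, True),
        Detection(25, False),  # short miss, ignored
        Detection(40, True),
        Detection(60, False),
        Detection(80, False),  # 40 s without a sighting: logged as left
    ]
    for when, what in presence_events(frames):
        print(f"{when:>6.1f}s  {what}")
```

The gap threshold is a design choice: a shorter value reacts faster but turns every missed frame into a false departure, so a real deployment would tune it to the camera's sampling rate.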

Today, this is achievable without needing to develop new technologies. What’s coming next will reshape how we adopt these devices.
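
To put the earlier pricing point in perspective, the short Python sketch below compares what round-the-clock analysis of a single camera would cost per month at "pennies per minute" versus "pennies per hour" or "per day". The 2-cent price and the 30-day month are assumptions made purely for illustration; actual vendor pricing varies and is not taken from the article.

```python
# Assumed, illustrative figures only (not real vendor pricing).
ASSUMED_PRICE_USD = 0.02   # "pennies" per billing unit
DAYS_PER_MONTH = 30

# How many billing units fit in one day of continuous streaming.
UNITS_PER_DAY = {"minute": 24 * 60, "hour": 24, "day": 1}


def monthly_cost(price_per_unit: float, unit: str) -> float:
    """Monthly cost in USD of analyzing one camera continuously."""
    return round(price_per_unit * UNITS_PER_DAY[unit] * DAYS_PER_MONTH, 2)


if __name__ == "__main__":
    for unit in ("minute", "hour", "day"):
        cost = monthly_cost(ASSUMED_PRICE_USD, unit)
        print(f"2 cents per {unit:<6}: ${cost:,.2f} per camera per month")
```

Under these assumptions, one camera costs about $864 a month at the per-minute rate, compared with $14.40 at the per-hour rate and $0.60 at the per-day rate, which is why continuous use only becomes practical once prices fall.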

The Next Five Years

The next generation of in-home cameras is going to combine advanced embedded AI features together with a highly reliable connection to online processing. We’ll see Alexa, Google Assistant, and Bixby, among others, embedded into the products and with that, the capability for them to understand what’s happening around us. Maybe we’ll become more comfortable with the idea of live streaming inside our home if the benefits are substantial.

Original blog post by: