The goal of cognitive computing is to get a computer to behave, think and interact the way humans do. In the next five years, machines will begin to emulate the human senses, each in its own way.
Touch
You will be able to touch through your phone.
by Robyn Schwartz, associate director of IBM Research Retail Analytics; Dhandapani Shanmugam, solutions architect; and Siddique A. Mohammed, software architect, IBM Software Group Industry Solutions.
Within the next five years, your mobile device will let you touch what you’re shopping for online. It will distinguish fabrics, textures, and weaves so that you can feel a sweater, jacket, or upholstery – right through the screen. Haptic devices such as gloves or “rumble packs” used in gaming have existed for years. But we use them in closed environments where the touch doesn’t actually connect to where we are in reality. We at IBM Research think that in the next five years, our mobile devices will bring together virtual and real-world experiences so that we can not just shop, but also feel the surface of produce and get feedback on data such as freshness or quality.
It’s already possible to recreate a sense of texture through vibration. But those vibrations haven’t yet been translated into a lexicon, or dictionary, of textures that match the physical experience. The idea is to match variable-frequency patterns of vibration to physical objects, so that when a shopper touches what the webpage says is a silk shirt, the screen emits vibrations that the skin translates into the feel of silk.
Vibrating air to feel like something solid
Using digital image processing and digital image correlation, we can capture texture qualities in a Product Information Management (PIM) system to act as that dictionary. Retailers could then use it to match textures with their products and their products’ data – sizes, ingredients, dimensions, cost, and any other information the customer might expect. The dictionary of textures will also grow and evolve as our appetite for, use of, and understanding of this kind of technology grows.
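To make the lexicon idea concrete, here is a minimal sketch, in Python, of how a PIM-backed dictionary might link product records to replayable vibration patterns. The SKUs, texture labels, and vibration parameters are hypothetical placeholders, not an actual IBM schema or measured haptic data.

```python
# Minimal sketch of a "dictionary of textures" backed by a PIM catalog.
# All names, SKUs, and numbers below are invented for illustration.

from dataclasses import dataclass

@dataclass
class VibrationPattern:
    """Variable-frequency pattern a haptic screen could replay."""
    frequencies_hz: list[float]   # sequence of vibration frequencies
    amplitudes: list[float]       # relative intensity for each frequency
    duration_ms: int              # how long each pulse lasts

# Texture lexicon: maps a texture label to the pattern that approximates
# how that surface feels under a fingertip.
TEXTURE_LEXICON = {
    "silk":     VibrationPattern([220.0, 180.0], [0.2, 0.1], 40),
    "denim":    VibrationPattern([90.0, 120.0, 90.0], [0.6, 0.8, 0.6], 80),
    "corduroy": VibrationPattern([60.0, 60.0, 60.0], [0.9, 0.2, 0.9], 120),
}

# Hypothetical PIM records linking product data to a texture label.
PIM_CATALOG = {
    "SKU-1042": {"name": "Silk blouse", "price": 89.00, "texture": "silk"},
    "SKU-2210": {"name": "Denim jacket", "price": 120.00, "texture": "denim"},
}

def haptic_feedback_for(sku: str) -> VibrationPattern:
    """Look up a product's texture and return the pattern to emit on screen."""
    texture = PIM_CATALOG[sku]["texture"]
    return TEXTURE_LEXICON[texture]

if __name__ == "__main__":
    pattern = haptic_feedback_for("SKU-1042")
    print(f"Emit {pattern.frequencies_hz} Hz for {pattern.duration_ms} ms")
```

In a real system the vibration parameters would come from digital image correlation measurements of the fabric itself rather than hand-picked numbers.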
We’re not ready to virtually high-five Tupac Shakur’s hologram through a phone – yet. Soon, though, the phone will be able to emit a field of vibrations just millimeters from the screen. And the vibrations will be subtle: your phone won’t shake out of your hand, but it will deliver a recognizable sensation. Imagine shopping for a wedding dress on a phone or tablet, and being able to feel the satin gown, or even the intricate beading and buttons, or the lace on the veil.
Beyond the clothing rack
Starting in retail makes sense because we all intrinsically understand the browsing and shopping experience. We all naturally respond to and understand texture, from a soft pair of socks to a ripe piece of fruit.
Translating the touch of something, based on accumulated data in a database, down to an end user’s mobile device could also help us gain new understanding of our environment. Take farming, for example. Farmers could use a mobile device to determine the health of their crops by comparing what they’re growing to a dictionary of healthy examples that they feel through a tablet.
The technology could evolve beyond communicating textures retrieved from a database, and toward real-time touch translation gained from accumulated user interaction with the technology. What is one of the first things a doctor does when treating an injured patient? Touch the injury. A patient could send a photo of an injury so the doctor can feel it remotely and make a faster diagnosis – before, or perhaps instead of, a visit in person.
Five years may not seem like enough time to begin feeling what you shop for through your smartphone. But look at how far the smartphone has come. You can see the people you talk to via a camera. You can remotely adjust your home’s thermostat, set your home alarm system, pay your babysitter, find your way to a local pizza restaurant, and watch a movie. Why can’t it also help you stay in “touch” with your environment?
Sight
A pixel will be worth a thousand words.
by John Smith, IBM senior manager, Intelligent Information Management.
They say a picture is worth a thousand words, but to a computer, an image is just thousands of pixels. Within the next five years, though, IBM Research thinks that computers will not only be able to look at images, but also help us understand the 500 billion photos we’re taking every year (that’s about 78 photos for each person on the planet).
Getting a computer to see
The human eye processes images by parsing colors and looking at edge information and texture characteristics. In addition, we can understand what an object is, the setting it’s in and what it may be doing. While a human can learn this rather quickly, computers traditionally haven’t been able to make these determinations, instead relying on tags and text descriptions to determine what the image is.
One of the challenges of getting computers to “see” is that traditional programming can’t replicate something as complex as sight. But by taking a cognitive approach, and showing a computer thousands of examples of a particular scene, the computer can start to detect the patterns that matter, whether in a scanned photograph uploaded to the web or in video footage taken with a camera phone.
Let’s say we wanted to teach a computer what a beach looks like. We would start by showing the computer many examples of beach scenes. The computer would turn those pictures into distinct features, such as color distributions, texture patterns, edge information, or motion information in the case of video. Then the computer would begin to learn how to discriminate beach scenes from other scenes based on these features. For instance, it would learn that beach scenes typically show certain color distributions, while a downtown cityscape is set apart by its particular distribution of edges.
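As a rough illustration of that training loop – not the production system described here – the following Python sketch turns a handful of labeled example images into color-distribution features and fits a simple classifier. The file names are placeholders, and a real system would add texture, edge, and motion features plus far more training data.

```python
# Simplified sketch: learn to separate "beach" from "cityscape" scenes
# using only color-distribution features. File paths are placeholders.

import numpy as np
from PIL import Image
from sklearn.linear_model import LogisticRegression

def color_histogram(path, bins=8):
    """Represent an image by its joint RGB color distribution."""
    img = np.asarray(Image.open(path).convert("RGB").resize((128, 128)))
    hist, _ = np.histogramdd(
        img.reshape(-1, 3), bins=(bins, bins, bins), range=((0, 256),) * 3
    )
    hist = hist.ravel()
    return hist / hist.sum()  # normalize so image size doesn't matter

# Placeholder training data: (image path, label) pairs.
examples = [
    ("beach_001.jpg", "beach"),
    ("beach_002.jpg", "beach"),
    ("city_001.jpg", "cityscape"),
    ("city_002.jpg", "cityscape"),
]

X = np.stack([color_histogram(path) for path, _ in examples])
y = [label for _, label in examples]

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Classify a new, unseen photo.
print(clf.predict([color_histogram("unknown_photo.jpg")]))
```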
Once the computer learns this kind of basic discrimination, we can then go a step further and teach it about more detailed activities that could happen within the beach scene: we could introduce a volleyball game or surf competition at the beach. The system would continue to build on these simpler concepts of what a beach is to the point that it may be able to distinguish different beach scenes, or even discern a beach in France from one in California. In essence, the machine will learn the way we do.
Helping doctors see diseases before they occur
In the medical field, where diagnoses come from MRI, X-ray and CT images, cognitive visual computing can play an important role in helping doctors recognize issues such as tumors, blood clots, or other problems sooner. Often what’s important in these images is subtle and microscopic, and requires careful measurement. Using the pattern recognition techniques described above, a computer can be trained to effectively recognize what matters most in these images.
Take dermatology. Patients often have visible symptoms of skin cancer by the time they see a doctor. With many images of patients from scans taken over time, a computer could then look for patterns and identify situations where there may be something pre-cancerous, well before melanomas become visible.
Share a photo – get better discounts
It’s not only images from specialized devices that are useful. The photos we share and like on social networks, such as Facebook and Pinterest, can provide many insights. By looking at the images that people share or like on these networks, retailers can learn about our preferences – whether we’re sports fans, where we like to travel, or what styles of clothing we like – and deliver more targeted promotions and individualized products and services.
Imagine getting promotions for kitchen gadgets or even certain kinds of food based on the images pinned to your “Dream Kitchen” Pinterest board.
Using Facebook photos to save lives
Sharing photos on social networks is not only beneficial for retailers and marketers, it could also help in emergency management situations. Photos of severe storms – and the damage they cause, such as fires or electrical outages – uploaded to the web could help electrical utilities and local emergency services to determine in real time what’s happening, what the safety conditions are and where to send crews. This same type of analysis could also be done with security cameras within a city. By aggregating all of the video data, police datacenters could analyze and determine possible security and safety issues.
In five years, computers will be able to sense, understand, and act upon these large volumes of visual information to help us make better decisions and gain insights into a world they couldn’t previously decipher.
Hearing
Computers will hear what matters
by IBM Master Inventor Dimitri Kanevsky.
Imagine knowing the meaning behind your child’s cry, or maybe even your pet dog’s bark, through an app on your smartphone. In the next five years, you will be able to do just that thanks to algorithms embedded in cognitive systems that will understand any sound.
Each of a baby’s cries – from pain, to hunger, to exhaustion – sounds different, even if it’s hard for us to tell them apart. But some of my colleagues and I patented a way to take the data from typical baby sounds, collected at different ages by monitoring brain, heart and lung activity, and interpret how babies feel. Soon, a mother will be able to translate her baby’s cries in real time into meaningful phrases, via a baby monitor or smartphone.
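The patented approach itself isn’t public code, but the general pattern – extract audio features from labeled recordings, then train a classifier – can be sketched in a few lines of Python. The file names, labels, and feature choice (MFCCs) below are illustrative assumptions only, not the IBM method.

```python
# Generic sketch of cry classification: audio features plus a classifier.
# Recording names and labels are placeholders.

import librosa
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def cry_features(path):
    """Summarize a recording with mean MFCCs (a standard audio descriptor)."""
    audio, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)  # average each coefficient over time

# Placeholder training set: (recording, what the cry turned out to mean).
labeled_cries = [
    ("cry_hungry_01.wav", "hunger"),
    ("cry_tired_01.wav", "exhaustion"),
    ("cry_pain_01.wav", "pain"),
]

X = np.stack([cry_features(path) for path, _ in labeled_cries])
y = [label for _, label in labeled_cries]

model = RandomForestClassifier(n_estimators=100).fit(X, y)
print(model.predict([cry_features("new_cry.wav")]))  # e.g. ["hunger"]
```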
Predicting the sound of weather
Sensors already help us with everything from easing traffic to conserving water. These same sensors can also be used to interpret sounds in those environments. What does a tree under stress during a storm sound like? Will it collapse into the road? Sensors feeding that information to a city datacenter would know, and could alert ground crews before the collapse.
Scientists at our Research lab in São Paulo are using IBM Deep Thunder to make these kinds of weather predictions in Brazil.
These improvements in auditory signal processing sensors can also apply to hearing aids or cochlear implants, to better detect, extract, and transform sound information into codes the brain can comprehend – helping with focus, or with the cancellation of unwanted sounds.
Forget to hit “mute” while on that conference call at work? Your phone will know how to cancel out background noise – even if that “noise” is you carrying on a separate conversation with another colleague!
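One classical way to get that kind of background suppression is spectral subtraction: estimate the noise spectrum from a noise-only stretch of audio and subtract it from each frame. The sketch below is a bare-bones illustration of the principle, not the adaptive processing a real phone would use.

```python
# Spectral subtraction: remove steady background noise whose average
# spectrum is known. Illustration only; parameters are arbitrary.

import numpy as np

def spectral_subtract(signal, noise_sample, frame=512):
    """Suppress noise whose spectrum matches `noise_sample`, frame by frame."""
    noise_mag = np.abs(np.fft.rfft(noise_sample[:frame]))
    out = np.zeros_like(signal)
    for start in range(0, len(signal) - frame, frame):
        chunk = signal[start:start + frame]
        spectrum = np.fft.rfft(chunk)
        magnitude = np.abs(spectrum) - noise_mag        # subtract noise energy
        magnitude = np.maximum(magnitude, 0.0)          # no negative magnitudes
        phase = np.angle(spectrum)
        out[start:start + frame] = np.fft.irfft(magnitude * np.exp(1j * phase))
    return out

# Usage: a synthetic tone buried in hiss, plus a separate noise-only sample.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000)
speech = 0.5 * np.sin(2 * np.pi * 220 * t)
noisy = speech + 0.2 * rng.standard_normal(t.size)
cleaned = spectral_subtract(noisy, noise_sample=0.2 * rng.standard_normal(512))
```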
Ultrasonics to bridge the distance between sounds
Sound travels at roughly 340 meters per second across thousands of frequencies. IBM Research also wants to translate information carried on ultrasonic frequencies that we humans can’t hear into audio that we can. So, in theory, an ultrasonic device could allow us to understand animals such as dolphins or that pet dog.
And what if a sound you want or need to hear could cut through the noise? The same device that transforms and translates ultrasonics could work in reverse. So, imagine wanting to talk with someone who, while only a short distance away, is still too far away to yell (say, from across a crowded room). A smartphone, associated with an ultrasonic system, could turn the speaker’s voice into an ultrasonic frequency that cuts through sounds in the room to be delivered to, and re-translated for only the recipient of the message (who will hear the message as if the speaker was standing close by – no receiving device needed).
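The underlying signal-processing trick for moving sound between bands is frequency mixing, or heterodyning. The toy sketch below shifts an inaudible 40 kHz tone down into the audible range; the frequencies and filter settings are arbitrary illustrations, not parameters of any IBM system.

```python
# Heterodyning: multiply an ultrasonic signal by a local oscillator, then
# low-pass filter to keep the audible difference tone. Illustration only.

import numpy as np
from scipy.signal import butter, filtfilt

fs = 96_000                                   # sample rate high enough for ultrasound
t = np.arange(0, 0.1, 1 / fs)

ultrasonic = np.sin(2 * np.pi * 40_000 * t)   # 40 kHz tone: above human hearing
oscillator = np.cos(2 * np.pi * 38_000 * t)   # local oscillator 38 kHz below it

mixed = ultrasonic * oscillator               # produces sum and difference tones
b, a = butter(4, 5_000, btype="low", fs=fs)   # keep only the 2 kHz difference tone
audible = filtfilt(b, a, mixed)

peak_bin = np.argmax(np.abs(np.fft.rfft(audible)))
print(f"Dominant output frequency: {peak_bin * fs / len(audible):.0f} Hz")
```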
This ultrasonic capability could also help a police officer warn a pedestrian to not cross a busy road, without shouting over the traffic noise. And parents could “call” their children to come in from playing in the neighborhood when it’s time for dinner – without worrying if their children’s cellphones were on or not.
Taste
Digital taste buds will help you to eat smarter
by IBM’s Dr. Lav Varshney, research scientist, Services Research.
An extraordinary dining experience of perfectly cooked food, with unique flavor combinations meticulously designed on a plate, heightens all of our senses.
But we may not realize that the way we perceive flavors and the characteristics of a “good” meal are fundamentally chemical and neural. In five years, computers will be able to construct never-before-heard-of recipes to delight palates – even those with health or dietary constraints – using foods’ molecular structure.
Lessons from Watson: inductive reasoning
Whereas traditional computing uses deductive reasoning to solve a problem with a definitive answer, our research team uses inductive reasoning to model human perception. Watson was a concrete example of this inductive type of computing system, interpreting natural language and answering vague and abstract questions.
Our team is designing a learning system that adds one more dimension to cognitive computing: creativity.
The system analyzes foods in terms of how chemical compounds interact with each other, the number of atoms in each compound, and the bonding structure and shapes of compounds. Coupled with psychophysical data and models of which chemicals produce perceptions of pleasantness, familiarity and enjoyment, the result is a unique recipe that uses combinations of ingredients that are scientifically flavorful.
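A toy version of that scoring step might look like the Python below: rank candidate ingredient pairings by the flavor compounds they share, weighted by an assumed pleasantness rating for each compound. The compound lists and ratings are invented placeholders, not real chemistry or psychophysical data.

```python
# Toy flavor-pairing score: shared compounds weighted by pleasantness.
# All data below is invented for illustration.

from itertools import combinations

# Hypothetical flavor-compound profiles for a few ingredients.
FLAVOR_COMPOUNDS = {
    "strawberry": {"furaneol", "linalool", "hexanal"},
    "basil":      {"linalool", "eugenol"},
    "parmesan":   {"butyric_acid", "hexanal"},
    "chocolate":  {"furaneol", "butyric_acid", "vanillin"},
}

# Hypothetical psychophysical "pleasantness" ratings per compound (0-1).
PLEASANTNESS = {
    "furaneol": 0.9, "linalool": 0.8, "hexanal": 0.4,
    "eugenol": 0.7, "butyric_acid": 0.3, "vanillin": 0.95,
}

def pairing_score(a: str, b: str) -> float:
    """Score a pair by how many compounds it shares, weighted by pleasantness."""
    shared = FLAVOR_COMPOUNDS[a] & FLAVOR_COMPOUNDS[b]
    return sum(PLEASANTNESS[c] for c in shared)

# Rank all pairings and surface the most promising combinations.
ranked = sorted(
    ((pairing_score(a, b), a, b) for a, b in combinations(FLAVOR_COMPOUNDS, 2)),
    reverse=True,
)
for score, a, b in ranked:
    print(f"{a} + {b}: {score:.2f}")
```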
So unlike Watson, which used known information to answer a question with a fixed answer, this system is creating something that’s never been seen before. It’s pushing computing to new fields of creativity and quickly giving us designs for novel, high-quality food combinations.
Picky eaters, dietary restrictions and social impact
Obesity and malnutrition pose severe health risks for populations around the world. Efforts to combat these issues have reached schools, where cafeteria lunches, for example, are getting a bad rap: federal mandates have swapped cookies for green beans, french fries for apples, and pizza for low-fat, low-sodium fajitas, with food often ending up in the trash instead of in students’ stomachs. The same goes for meals at hospitals and nursing homes.
My team believes if you can optimize flavor while meeting nutritional constraints, you can mitigate health issues. For food service companies, creative computers can come up with flavorful meals that also meet predetermined nutritional objectives – so rather than throwing the meal away and heading for a bag of potato chips in the vending machine, students would eat a healthy meal they actually enjoy.
Many communities in sub-Saharan Africa have access to only a few base ingredients for any given meal. But limited resources should not eliminate the enjoyment of food. A creative computer can optimize flavor profiles within these constraints, creating a variety of never-before-thought-of meals that please the palate, encourage consumption, and help prevent malnutrition.
There’s what in my quiche?
Our culinary creation system has access to large databases of recipes from online, governmental, and specialized sources. The repository allows the system to learn what we consider to be good food. For example, from 50 recipes of quiche, the system can infer that a “good” combination of ingredients for any variation of quiche would include eggs, at least one vegetable, and three spices.
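That inference step can be sketched as simple frequency counting over a recipe corpus, as below. The three miniature recipes stand in for the 50 mentioned, and the 80 percent threshold is an arbitrary assumption.

```python
# Infer the "backbone" of a quiche from how often ingredients recur.
# The recipes here are tiny invented stand-ins for a real corpus.

from collections import Counter

quiche_recipes = [
    {"eggs", "cream", "gruyere", "spinach", "nutmeg", "pepper", "salt"},
    {"eggs", "milk", "cheddar", "broccoli", "onion", "pepper", "salt"},
    {"eggs", "cream", "bacon", "leek", "thyme", "pepper", "salt"},
]

counts = Counter(ing for recipe in quiche_recipes for ing in recipe)

# Ingredients present in most recipes are inferred to be essential.
threshold = 0.8 * len(quiche_recipes)
backbone = [ing for ing, n in counts.items() if n >= threshold]
print("Inferred backbone of a quiche:", sorted(backbone))
# With real data this would surface eggs, a dairy base, vegetables, and spices.
```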
With an understanding about what quiche is, and access to information about a world of other ingredients, the system can create a completely novel quiche. Perhaps a quiche that uses venison, fenugreek and sandalwood?
Borrowing methods from psychology and information theory, the system can compute how surprising this new recipe is compared to previous knowledge. If the new recipe is also flavorful and healthy, a chef might consider putting it on her menu.
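One standard information-theoretic measure of that surprise is surprisal: the negative log-probability of each ingredient under frequencies estimated from prior recipes. The sketch below is a minimal illustration with invented counts; the actual system’s novelty measure may be more elaborate.

```python
# Surprisal of a recipe: rare or unseen ingredients (venison, sandalwood)
# contribute many bits; staples (eggs) contribute few. Counts are invented.

import math
from collections import Counter

# How often each ingredient appeared across a corpus of known recipes.
prior_counts = Counter({"eggs": 48, "cream": 30, "spinach": 12, "fenugreek": 1})
total = sum(prior_counts.values())
vocabulary_size = 1000   # assumed number of distinct known ingredients

def surprisal_bits(ingredient: str) -> float:
    """-log2 of the (smoothed) probability of seeing this ingredient."""
    # Laplace smoothing so unseen ingredients get a small nonzero probability.
    p = (prior_counts[ingredient] + 1) / (total + vocabulary_size)
    return -math.log2(p)

novel_quiche = ["eggs", "cream", "venison", "fenugreek", "sandalwood"]
for ing in novel_quiche:
    print(f"{ing}: {surprisal_bits(ing):.1f} bits")
print(f"Recipe surprisal: {sum(surprisal_bits(i) for i in novel_quiche):.1f} bits")
```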
How did we get here? Everyone eats, and food is central to who we are. So it would be very powerful if we could enhance this human experience in such a visceral way.
From a computing perspective, this work points us in a completely different direction: machine creativity. With a research team that even includes a professionally trained chef-turned-computer-engineer, we believe that in five years, amazing meals will be created with the help of cognitive systems.
Smell
Computers will have a sense of smell
by IBM Research’s Dr. Hendrik F. Hamann, research manager, physical analytics.
Within the next five years, your mobile device will likely be able to tell you you’re getting a cold before your very first sneeze.
With every breath, you expel millions of different molecules. Some of these molecules are biomarkers, which can carry a plethora of data about your physical state at any given moment. By capturing the information they carry, technology can pick up clues about your health and provide valuable diagnostic information to your physician.
What’s that smell?
In this evolving new era of cognitive computing, computers are increasingly able to process unstructured data, draw conclusions based on evidence, and learn from their successes and mistakes. This makes them progressively more valuable diagnostic tools to help humans solve problems and answer questions.
However, to learn, one first has to sense.
Tiny sensors that ‘smell’ can be integrated into cell phones and other mobile devices, feeding the information carried by these biomarkers to a computer system that can analyze the data.
Similar to how a breathalyzer can detect alcohol from a breath sample, sensors can be designed to collect other specific data from the biomarkers. Potential applications could include identifying liver and kidney disorders, diabetes and tuberculosis, among others.
The level of sensitivity of a sensor will depend on a number of factors, including how big it needs to be and what type of data is being detected. We have already demonstrated in the lab a number of examples where relatively simple sensing systems can measure biomarkers down to a single molecule.
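On the analysis side, the simplest possible sketch is to compare measured biomarker concentrations against reference ranges and flag outliers for a physician. The biomarker names, units, and thresholds below are illustrative placeholders, not medical reference values, and a cognitive system would go far beyond fixed thresholds.

```python
# Toy breath-screening sketch: flag biomarkers outside an assumed range.
# Thresholds and values are placeholders only, not diagnostic guidance.

# Hypothetical upper bounds (parts per billion) considered "typical".
REFERENCE_UPPER_PPB = {
    "acetone": 900,
    "ammonia": 800,
    "nitric_oxide": 25,
}

def screen_breath_sample(measurements_ppb: dict[str, float]) -> list[str]:
    """Return the biomarkers that exceed their assumed reference range."""
    return [
        marker
        for marker, value in measurements_ppb.items()
        if marker in REFERENCE_UPPER_PPB and value > REFERENCE_UPPER_PPB[marker]
    ]

# Example reading from a hypothetical phone-mounted sensor.
sample = {"acetone": 1450.0, "ammonia": 420.0, "nitric_oxide": 18.0}
flagged = screen_breath_sample(sample)
print("Flag for physician review:", flagged or "nothing unusual")
```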
Understanding the data
There are, at the very least, two key components to having a computer understand what the sensors capture.
A computer has to be able to constantly learn, as well as combine new and old information from a number of sources. To do this effectively, new generations of computers are required – computers containing new devices, circuitry and architectures through which data can be processed in a massively parallel fashion.
Where in the past, physicians relied on visual clues and patient descriptions to form a diagnosis, just imagine how helpful it will be to have the patient’s own body chemistry provide the clues needed to form a more complete picture.