British science-fiction writer and futurist Arthur C. Clarke, one of the great visionaries of the twentieth century, predicted, amongst other things, global access to information, instant search algorithms and even smartphones. Above all, however, he coined what is known as Clarke’s Third Law: “Any sufficiently advanced technology is indistinguishable from magic.” And voice technology is indeed indistinguishable from magic.

In 1971, the United States Department of Defense began funding research to develop a machine, Harpy, that could understand roughly 1,000 words. However, the first patent for speech recognition technology was filed over a decade later, in 1983, by Italian researchers Michele Cavazza and Alberto Ciaramella.

Throughout the eighties and nineties, government agencies and large telecommunications companies ran various speech recognition projects, but the first consumer product – a dictation software program – did not appear until 1990.

It was not until the beginning of the new millennium, however, that speech recognition systems began to gain ground amongst consumers. The first application was dictation, but these programs were slow and needed long training periods before they could recognise a user's speech accurately. Other services, such as automatic translation and human-sounding text-to-speech, continued to prove too complex.

Indeed, the turning point in the application of voice technology has been very recent. The rise of the cloud and access to big data have allowed voice recognition systems to draw on far more information, “learn” more rapidly and function far more efficiently and precisely.

The first voice recognition system on a smartphone was introduced in 2008, and just a few years later all the major digital players began to launch digital assistants capable of executing commands, providing information and taking dictation for notes, appointments and articles. Voice control is currently also being integrated into home automation systems, and vehicle manufacturers are developing systems that allow users to operate infotainment, navigation and heating/air-conditioning in their cars by voice. Indeed, vehicles are an optimal environment for voice-controlled services, allowing us to keep our hands on the steering wheel, at least until self-driving cars appear on the market.

Speaking to our devices is a fundamental milestone in the evolution of technology. We are slowly shifting towards a more natural form of interaction with digital devices and tools. It is the “magic” that is eliminating the need for user interfaces such as keyboards, switches and even touch surfaces. And while to a certain extent this technology is already in place, the future will be characterised by the disappearance of user interfaces and the evolution of a seamless human-machine environment.

Screens will probably survive for the moment, as we still like to make sure that our voice commands have been received correctly. And that’s OK as, for the foreseeable future, voice technology will continue to take advantage of smartphones – the true hubs of our current digital environment – to control home automation, satellite navigation and insurance telematics apps, just to name a few of the major early adopters in this field.

Naturally, as with all modern digital technology, voice technology presents a range of issues related to privacy and data protection. As this technology progresses and leaves our smartphones to take advantage of ubiquitous listening devices in our houses, vehicles, offices, and even in public facilities, what will happen to our questions and information requests as they are absorbed into the cloud? Who will control this information? Who will have access to it?

Soon, we will be able to put these questions directly to the cloud. And then the magic trick will be perfect.