Tomcat
The article examines a new Internet-based information technology: the voice assistant. It reviews the forms, types, features and characteristics of voice assistants offered by different companies, defines the voice assistant as a current information technology that operates in both the real and the virtual world, and weighs its advantages and disadvantages.
Keywords: voice assistant, Internet, information security, user, arms race, sound, artificial intelligence, human speech.
For decades, humanity has dreamed of a voice interface, as described in science fiction. And now, thanks to new information technologies and the Internet, virtual voice assistants have appeared and are gaining popularity among users around the world.
For many companies, voice assistants have become not just a point of contact with consumers but an additional channel of communication. A company can hold a spoken dialogue with the user, which helps it learn more about that user and creates a new kind of interaction. [1]
The topic is relevant because voice assistants are being integrated into more and more areas of personal and business activity and are gradually becoming an integral part of effective, cutting-edge interactive marketing communications. [2]
According to experts, the history of voice assistants began in the late 1930s, when scientists first attempted to recognize a person's voice, and Bell announced the first system for recognizing spoken numbers. Some time later, in 1962, IBM presented a new system called Shoebox at the World's Fair in Seattle (USA). It could perform arithmetic and recognize sixteen spoken words, including the digits from 0 to 9.
The next step was the "Harpy" system, developed in the 1970s by scientists at Carnegie Mellon University in Pittsburgh, Pennsylvania (USA). It recognized more than a thousand words, roughly the vocabulary of a three-year-old child.
As technologies emerged that recognized word sequences, companies began building applications on them. In the 1990s, IBM, Apple and others developed solutions based on voice recognition: in 1993, Apple released the Macintosh with PlainTalk technology, and in April 1997, Dragon presented a solution that could transcribe up to a hundred words per minute.
Smart home solutions followed: in November 2014, Amazon presented the Echo smart speaker with the Alexa assistant; two years later, in November 2016, Google released Google Home; and in February 2018, Apple also entered this market.
Currently, the most comprehensive product in this field is the voice assistant itself, since it brings together all existing voice technologies: voice recognition, speech analysis and processing, text-to-speech, and voice biometrics. [3]
According to experts, voice assistant technology is now generally available, and an assistant can be written in almost any programming language. The most popular choice in 2020 was the high-level language Python. The assistant is activated by a voice command. The received signal is converted into digital form and filtered to remove external noise. The converted signal is then sent to the identification subsystem.
In this subsystem, the signal database is first queried in order to recognize the command. If the input signal matches a stored signal, recognition is considered successful and the command is passed to the executing device, which performs the corresponding action. If the command is not recognized, the system returns to the start, waits for voice input again, and repeats the procedure until the command is recognized. [4]
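The recognition loop described above can be sketched in Python. This is only an illustration under stated assumptions, not how any particular assistant works: the feature vectors, the `SIGNAL_DB` contents, the noise gate and the cosine-similarity threshold are all hypothetical stand-ins for the real signal processing.

```python
import math
from typing import Callable, Dict, Iterable, List, Optional, Sequence

# Hypothetical signal database: command name -> stored reference feature vector.
SIGNAL_DB: Dict[str, List[float]] = {
    "lights_on": [1.0, 0.0, 0.2],
    "play_music": [0.0, 1.0, 0.1],
}


def denoise(samples: Sequence[float], floor: float = 0.05) -> List[float]:
    """Crude noise gate (an assumption): zero out components below the floor."""
    return [s if abs(s) >= floor else 0.0 for s in samples]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(signal: Sequence[float], threshold: float = 0.9) -> Optional[str]:
    """Query the signal database; return the best match at or above the threshold."""
    best_name, best_score = None, threshold
    for name, reference in SIGNAL_DB.items():
        score = cosine(signal, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def run_assistant(inputs: Iterable[Sequence[float]],
                  execute: Callable[[str], None]) -> Optional[str]:
    """Keep taking voice input until a command is recognized, then execute it."""
    for raw in inputs:
        command = identify(denoise(raw))
        if command is not None:
            execute(command)  # hand the command to the executing device
            return command
    return None  # input ended without a recognized command
```

Here `run_assistant` takes a finite iterable of inputs so the sketch terminates; a real device would keep listening to the microphone instead.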
A voice assistant, then, is a modern service based on artificial intelligence that recognizes human speech and can perform various actions in response to voice commands.
Voice assistants are most often used in smartphones, smart speakers and some modern browsers. There are currently several common voice assistants, each with its own strengths and weaknesses. [5]
Home voice assistants, for example, are small speakers that can be placed wherever is convenient for the user. On mobile devices, assistants run as applications that must be downloaded to the device.
Among home assistants, first place goes to the voice assistant "Alice" from Yandex. The speaker comes with Russian preinstalled, which makes the device convenient to use. It works with both iOS and Android and can be used to check the weather, play music and quickly find answers. The program searches the Internet and offers the most suitable option.
Has positive aspects:
It also has some negative aspects:
Another, less popular option is the Google Nest Mini (2nd gen), an affordable speaker that can be used in the car and at home. Google Assistant is preinstalled on the device, which connects via Wi-Fi or Bluetooth. The device can be mounted on the wall, and thanks to three speakers, users can get the information they need anywhere in the house. It provides up-to-date information and plays music.
The positive aspects are: 1. Clarity of sound; 2. Quick formation of answers to the questions asked.
Disadvantages: 1. Need for language settings; 2. English language pack is pre-installed.
Third place goes to Apple's assistant "Siri", which is built into Apple's smart speakers and other Apple devices. For home use, a speaker with sensitive speakers that quickly recognize commands is advisable. It can also be used in a car. The program recognizes many languages and, if desired, can answer in a male or a female voice. The system is widely used on mobile phones and runs quickly on Apple devices: iPhone, iPad, Apple TV and Apple Watch. The assistant can be used to manage calls, music and applications.
It has positive aspects: 1. Clarity of sound; 2. Quick command recognition; 3. Ease of control.
Cons: Can only be used with Apple devices.
There are also voice assistants created specifically for personal computers, which help users search for the information they need.
According to experts, first place in this group goes to a home voice assistant called "Gorynych", which simplifies the use of a computer.
Voice commands can be used to control applications and the mouse, and the size of the dictionary allows commands to be recognized reliably. If desired, the dictionary can be extended with new phrases from time to time. The assistant can be downloaded using any browser.
The assistant also takes up little memory, so it does not affect the speed of the personal computer.
Positive aspects: 1. Quick search of requested information; 2. Possibility of text input; 3. Launching applications present on the personal computer.
Negative aspects: no disadvantages have been identified by users or specialists.
Second place goes to the voice assistant "Cortana", developed by Microsoft. It is supported on Android, Xbox One, Windows Phone and Microsoft Band, so it can also be installed on tablets. With Cortana you can plan your day, get directions, find the data you need, read emails and find music. The assistant can also open an application when the user's hands are busy.
Pros: 1. Easy to install. 2. Applies to computers and phones.
Disadvantages: 1. Users and specialists have not identified any disadvantages.
In third place is a relatively new Russian voice assistant, "Marusya", which is just beginning to gain popularity. It was released by Mail.ru Group and can be installed on personal computers, tablets and smartphones. With its help, users can quickly find information on the Internet. The assistant can also remind you of an important date and plan your day down to the minute.
Pros: 1. High speed of request processing; 2. Applies to computers and phones; 3. Extremely easy to install on a device.
Disadvantages: can only be used if you have Russian citizenship.
It should also be noted that voice assistants are used most frequently on smartphones.
Through these applications, users can open additional tabs and use their mobile device hands-free.
In first place is the popular "Google Assistant". This most widely used smartphone assistant recognizes up to three dozen languages and supports Android, iOS and the Chrome browser. To use it, you install the program and activate it with the phrase "Okay, Google". The assistant can plot a route, report the weather or place a call when the user's hands are busy. It can also write a message and find and open an application.
Positive aspects: 1. High speed of work; 2. Quick search of necessary information.
Cons: Generates too brief responses to queries.
In second place is the voice assistant "Dusya", which can be activated by voice or by a wave of the hand. The user can create custom tasks in advance and run them in the required mode. The assistant promptly reminds the user of important matters and can send a message. It can also be used to look up information.
Positive aspects: 1. Easy to operate; 2. Possibility to independently create functions; 3. Possibility to work without a voice command.
Cons: There is no free format for use.
Third place goes to "Amazon Alexa". This application can be considered universal and is often installed on mobile phones, while the assistant is also very popular for home automation. The application can write emails, play your favorite music, search for information on the Internet, and so on.
For home use, you need to buy a dedicated device, which can then open blinds and doors and turn on lights and music. After installation, the device is activated and responds only to certain voices.
Pros: 1. Universal use; 2. Functions quickly; 3. Can open applications.
Negative aspects: no disadvantages have been identified by users or specialists.
The voice assistants listed above are extremely popular.
However, one cannot fail to mention special voice assistants from domestic developers:
These assistants are steadily entering our lives, and every day more people use them.
When choosing a voice assistant, however, you need to consider which functions matter to the user. Each assistant has its own functionality, which is updated regularly, so individual criteria are very important when choosing the right system. [6]
Experts have tested several systems that appeared on the first page of search results, including Google Assistant, Alice, Siri, and software from Microsoft and Amazon. In their opinion, Google Assistant and Yandex Alice stood out: these assistants gave more correct answers to the experts' questions. At the same time, the experts say that no voice assistant is ideal and that all of them need improvement.
As voice assistants are integrated into more devices, marketers may need to adjust their approach to communicating with users and make it more personal. This will only work if companies find a way to protect voice assistants effectively against fraud. [7]
Even now, however, modern voice assistants significantly reduce the time spent on simple, routine daily tasks, as the examples above show.
Their functionality is quite extensive: conversing with the user, searching the Internet and giving short answers to queries, calling a taxi, making calls and writing messages, playing music, setting alarms, and building a route that includes the places the user needs along the way. Voice assistants also take into account the user's location, the time of day and the day of the week, as well as the history of the user's previous requests.
Individual sectors of the economy show where they have found application.
Voice assistants are now present in many fields and industries, often in quite unconventional forms. For example, Zyrtec, a well-known brand of allergy products, launched a voice assistant that gives users useful information about risk factors for allergy sufferers. Virtual bots from Tide teach users how to remove stains, and the digital assistant of the job search service hh.ru selects job offers or reports the average salary. Nike has also decided to make life easier for its customers: shoelaces can be tightened by voice command through an iPhone or a smartwatch.
Beyond such purely image-driven projects, there are industries in which voice technologies directly affect efficiency and the performance of very specific tasks.
In retail, for example, online purchases are made through voice assistants, and North American retailers use voice technologies to automate sales processes fully. The popularity of voice technologies in online shopping comes from the added convenience for customers: an assistant can easily answer a question such as "Where can I buy a coat at the most affordable price?" or give competent advice on choosing a product.
Worldwide, the most advanced voice assistants are gradually becoming part of the company team in those retail niches that are characterized by frequent repeat orders.
The current trendsetters include well-known leaders in their industries such as Amazon and McDonald's.
Voice assistants are also a very powerful tool for collecting data, monitoring and improving service quality, optimizing processes and checking compliance with corporate standards. For example, the Russian Post robot, built on the Yandex SpeechKit technology of the Yandex.Cloud platform, has proven effective at accepting a complaint, determining the root cause of an incident, and registering the complaint and assigning it a number before the request is passed on for handling.
In banking, voice requests can be used in much the same way as in retail, that is, for recurring orders, including microloans. Voice assistants have also already proven highly effective in medicine. Voice interfaces are becoming personal doctors with vast experience: their "memory" is practically inexhaustible.
Smart assistants will not only advise the user competently but also prescribe a treatment plan professionally. Robots also rarely forget or overlook anything when processing test results. One of the most striking examples of this approach today is the medical system "Triad Health AI", which uses Google Home and Amazon Alexa in the treatment of Parkinson's disease.
It is equally clear that highly specialized niches, such as the sale of engineering equipment, will remain on the technological periphery for a long time. Such areas have not yet moved to mass online sales, let alone to voice assistants.
Voice assistants will therefore continue to develop, improve and find new areas of application. One aspect will grow in importance: once the assistant has recognized speech and converted it to text, it must understand what kind of answer the user expects from it.
In Alice, for example, user queries are first sorted by intent and only then routed to thematic sections such as music playback or informal conversation. The key task of the intent classifier is therefore to determine what exactly the user meant by the phrase. The classifier splits the query into words and punctuation marks. Each of these is mapped to an embedding, a special word representation trained on very large amounts of data, which makes it possible to understand the contexts in which the user's words are usually used.
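As a rough illustration of this idea, the sketch below tokenizes a query into words and punctuation marks and picks an intent from averaged word vectors. It is not Alice's classifier: the two-dimensional `EMBEDDINGS` table holds invented values, the intent names are hypothetical, and a real system would use embeddings trained on large corpora together with a trained model.

```python
import re
from typing import Dict, List, Tuple


def tokenize(query: str) -> List[str]:
    """Split a query into lowercase words and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", query.lower())


# Toy embeddings with invented values: axis 0 ~ "music", axis 1 ~ "weather".
EMBEDDINGS: Dict[str, Tuple[float, float]] = {
    "play": (1.0, 0.0), "song": (0.9, 0.0), "music": (1.0, 0.0),
    "weather": (0.0, 1.0), "rain": (0.1, 0.9), "forecast": (0.0, 1.0),
}
INTENTS = ("music", "weather")  # hypothetical intent names, one per axis


def classify_intent(query: str, min_score: float = 0.5) -> str:
    """Average the vectors of known tokens and return the strongest intent.

    Queries with no known tokens, or with a weak signal, fall back to "chat".
    """
    vectors = [EMBEDDINGS[t] for t in tokenize(query) if t in EMBEDDINGS]
    if not vectors:
        return "chat"
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(len(INTENTS))]
    best = max(range(len(INTENTS)), key=lambda i: mean[i])
    return INTENTS[best] if mean[best] >= min_score else "chat"
```

For instance, `classify_intent("Will it rain tomorrow?")` selects the weather intent because "rain" is the only token with a known vector, while a question with no known tokens is sent to informal conversation.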
For each intent there is a special template that extracts useful information from what the user has already said. This component is called a semantic tagger. When asking a question, people rarely supply all the information needed to answer it, so the voice assistant must fill in the gaps on its own.
For example, to report the weather in a specific city, the assistant can ask the user clarifying questions or, if geolocation is enabled on the device, obtain the missing information itself, which is an important advantage of the system.
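The gap-filling behavior for the weather example can be sketched as follows. The `KNOWN_CITIES` list, the `Reply` type and the substring matching are assumptions made for illustration; a real semantic tagger would use a trained model rather than a fixed list.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical gazetteer used in place of a real semantic tagger.
KNOWN_CITIES = ("moscow", "london", "seattle")


@dataclass
class Reply:
    kind: str  # "answer" or "clarify"
    text: str


def extract_city(query: str) -> Optional[str]:
    """Return the first known city mentioned in the query, if any."""
    lowered = query.lower()
    for city in KNOWN_CITIES:
        if city in lowered:
            return city.title()
    return None


def handle_weather(query: str, device_city: Optional[str] = None) -> Reply:
    """Fill the 'city' slot from the query, then from geolocation, else ask."""
    city = extract_city(query) or device_city
    if city is None:
        return Reply("clarify", "Which city do you mean?")
    return Reply("answer", f"Looking up the weather in {city}.")
```

Here `device_city` stands for whatever the device's geolocation reports; passing `None` models a device with geolocation turned off.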
A definite advantage is that a request that matches none of the scenarios is not ignored but redirected to search or to the informal conversation module. In Alice, for example, this module is called "Chat". Voice assistants are very often used not to learn or do something specific but to play: users ask the assistant what books it likes or what it is wearing.
This task is handled with standard editorial answers: the developers select hundreds of the questions users ask most often and write several answer options for each. All answers must be written in a single style so that together they form a coherent image of the specific assistant.
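The fallback and the editorial answers might be combined roughly as in the sketch below. The `EDITORIAL_ANSWERS` entries, the `route` function and the string prefixes it returns are invented for illustration and do not describe any real assistant.

```python
import random
from typing import Dict, List, Optional

# Hypothetical editorial answers: normalized question -> several answer options.
EDITORIAL_ANSWERS: Dict[str, List[str]] = {
    "what books do you like": [
        "I enjoy science fiction.",
        "Anything with a happy ending.",
    ],
    "what are you wearing": ["Just a few lines of code."],
}


def normalize(query: str) -> str:
    """Lowercase the query and strip surrounding spaces and end punctuation."""
    return query.lower().strip(" ?!.")


def route(query: str, scenario: Optional[str],
          rng: Optional[random.Random] = None) -> str:
    """Route a query: matched scenario first, then editorial chat, then search."""
    if scenario is not None:
        return f"scenario:{scenario}"
    options = EDITORIAL_ANSWERS.get(normalize(query))
    if options:
        # Pick one of the prepared answers so replies do not always repeat.
        return (rng or random).choice(options)
    return f"search:{query}"
```

Passing a seeded `random.Random` as `rng` makes the choice of answer reproducible, which is convenient for testing.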
To answer vague questions and sentences that cannot be classified, voice assistants typically use simple neural networks trained on texts from media, books and movies. Alice, for example, learns from a variety of materials in which characters, among other things, swear and argue.
As a result, voice assistants can learn things quite different from what the developers directly intended. And when an assistant has never encountered certain expressions, it responds to them with essentially random phrases, because those words remain unknown to it.
At first glance, the advantages of voice assistants are obvious. People now want to receive information from the Internet immediately. The pace of life leaves little time for text interfaces, and the help desks of financial institutions and government agencies, like the most popular telephone services, are overloaded with user requests. Here voice assistants can ease the situation and solve some of the problems.
Discussions of voice and conversational interfaces typically cover a range of systems, from bots that answer simple questions or make random jokes to complex systems used at the industrial level.
For example, a surprisingly good voice interface is now available for filing tax returns in England. The key players in this area, however, are still Amazon's Alexa, Apple's Siri, Google Assistant and Microsoft's Cortana. Alibaba's assistant is also in demand as a very well-designed product for Chinese users.
It is also worth noting that the companies with the resources, knowledge and skills to take a significant step forward in the development of voice assistants are, surprisingly, not at all interested in taking it. Voice interfaces and voice assistants are innovations that fundamentally change the current state of affairs. [8]
Google, for example, makes money on advertising. If the user starts receiving a ready-made answer instead of links with advertising displayed next to them, a reasonable question arises about what to do with the advertising. Siri, in turn, is an excellent tool for increasing iPhone sales. It does its own narrowly specialized job, and for Apple it makes no sense at the moment to build something new that would transform the established App Store ecosystem.
Users expect that in the foreseeable future people will be able to express their wishes naturally and the system will fully understand them. The system will therefore have to adapt to the human, not the other way around. The creation of the most modern and promising voice assistants should thus most likely be approached through an understanding of universal human characteristics. [9]
However, the dangers and threats posed by voice assistants cannot be ignored. The widespread use of voice assistants and the development of the Internet of Things raise the issue of security for everyone who actively uses these technologies. [10]
Many owners of voice assistants and smart speakers are seriously concerned about how much information these devices capture when recording conversations. Although the recorded speech is usually stored in encrypted form on the developer's servers, the microphone can be turned off, and any recording can easily be deleted manually, the technology is still very far from perfect, and it is not always clear what to expect from it.
Various experts share this opinion. They say that in the future devices will be able to identify the voice of a specific person easily and keep a list of those who have access to the device.
They are also concerned that voice assistants can easily fall into the hands of children, who may unknowingly make very large purchases and create problems for their parents. Such incidents have become so common that many large retailers have launched programs to refund money for goods ordered by small children. For example, in Dallas, Texas (USA), a six-year-old child asked the smart speaker "Amazon Echo" for a dollhouse and a couple of pounds of cookies. The assistant, "Alexa", which does not distinguish between voices, fulfilled the request very quickly and purchased one of the more expensive models.
Some companies noticed these problems and immediately took advantage of them, exploiting for their own purposes the fact that voice assistants do not recognize the voice of their owner. For example, one company released an advertising video in which a spoken search query activated Google Home speakers. Viewers' smart speakers were triggered and, regardless of their owners' wishes, read out the Wikipedia article about the restaurant product named in the query. Google eliminated the consequences of this aggressive campaign, and the speakers no longer respond to such tricks. Nevertheless, a very high risk of similar viral campaigns in the future remains.
Accordingly, many users now describe voice assistants as a real nightmare for privacy, since these systems process more and more information from each user's daily life every year. Yet if these attitudes have affected sales, the effect has been very small. Modern voice assistants are breaking popularity records, and many users consider them not only convenient but also promising and very prestigious. [11]
Systems that combine voice with a visual interface are now being actively developed. A huge advantage of visual interfaces is that the interaction options are visible, whereas in a voice interface the user does not know what exactly is available. Interaction with screens is a very well-studied topic. The screen will remain even if the voice interface works well, if only because people have eyes: visual perception is the main channel, and voice is auxiliary. Voice can interact with data displayed on the screen without being subordinate to it. In "Alexa", for example, voice is currently the main component, and the user can install an application to see all of the system's responses on a screen if some of them are difficult to hear. The concept is changing, however, and the next version of "Amazon Echo" is expected to have a modern screen.
There are other problems directly related to voice assistants. They can store more information than was originally intended. Assistants should start recording only after they hear the wake word from the owner. Often, however, they are triggered by similar-sounding words, a television or music player, or ordinary everyday conversation.
In addition, employees of the development companies may well have access to the private information of any user.
This is because people trust new Internet technologies almost completely, do not think about their personal information security and, as a result, often do not check how voice assistants work. The assistants, in turn, are capable of capturing and transmitting purely personal information, including medical history.
According to The Guardian, Apple has changed its quality control program for the voice assistant Siri. Under the new rules, employees will no longer be able to listen to voice commands sent by Siri users without the user's consent.
Such concerns and changes are not unfounded. Attackers can quite easily make use of a user's personal data. Like any other information collected by companies, voice recordings are exposed to the risk of hacker attacks. Stolen recordings can be used to imitate the user's voice and break into accounts that seem to be reliably protected by biometric data. In some cases no attack is even needed. For example, in one known case an Amazon user who requested a file with his own data received, purely by chance, more than one and a half thousand audio recordings of a complete stranger.
Conflicts of interest may also arise. Companies collect users' personal data in order to solve their clients' problems as well as possible. However, any collected personal information can be used by a company not only for itself but also for the benefit of certain partners. [12]
According to experts, some employees of large companies that develop voice assistants, knowing certain codes and how the systems work, are able to find out where requests to a voice assistant were made from and very quickly determine the user's home address and other available information.
The current error rate is expected to fall by an order of magnitude in the very near future thanks to the latest machine learning models. It is quite possible that in the foreseeable future every user will have a personal voice assistant with the voice of his or her choice.
The latest voice interfaces are spreading extremely quickly, and soon we will probably see remarkable personalization that is simply not possible with text search today. I also consider it necessary to note that although voice assistants are quite well developed today, the technology has not yet reached its limit. In the coming years it will develop in different directions. New voice assistants will soon find their "own face", which will expand their capabilities.
In conclusion, the practice of recent years shows that many domestic and foreign IT companies developing voice assistants have already entered a so-called "arms race". This struggle will undoubtedly give the winners an enormous body of information, which can become an endless source not only of competitive advantage but also of very solid income.
In this regard, I consider it appropriate to say: "Think about your personal information security!"
Literature:
Source
Keywords: voice assistant, Internet, information security, user, arms race, sound, artificial intelligence, human speech.
For decades, humanity has dreamed of a voice interface, as described in science fiction. And now, thanks to new information technologies and the Internet, virtual voice assistants have appeared and are gaining popularity among users around the world.
For many companies, voice assistants have become not just a point of contact with consumers, but an additional new channel of communication. It becomes possible to conduct a necessary (verbal) dialogue with a person (user), which helps to obtain additional information about him, as well as to create a new experience of interaction. [1]
The relevance of the issue under study is that the integration of voice assistants is actively being introduced into various types of human or business activities and is gradually becoming an integral part of highly effective and cutting-edge interactive marketing communications. [2]
The history of voice assistants began in the late 1930s, when, according to experts, scientists first attempted to recognize a person's voice. Bell's system for recognizing spoken digits was the first to be announced. Some time later, in 1962, IBM presented a new tool called Shoebox at the World's Fair in Seattle (USA). It could perform arithmetic and recognize sixteen spoken words, including the digits 0 to 9.
The next step was the "Harpy" system, developed in the 1970s by scientists at Carnegie Mellon University in Pittsburgh, Pennsylvania (USA). It could recognize more than a thousand words, roughly the vocabulary of a three-year-old child.
As technologies emerged that could recognize sequences of words, companies began building applications on them. In the 1990s, IBM, Apple and others developed products that used voice recognition: in 1993, Apple released Macintosh computers with PlainTalk technology, and in April 1997, Dragon presented a solution that could transcribe up to about a hundred words per minute.
The next stage was smart home devices: in November 2014, Amazon presented the Echo smart speaker with the Alexa assistant; two years later, in November 2016, Google released Google Home; and in February 2018, Apple entered this market as well.
Currently, the most comprehensive product in this field is the voice assistant, since it combines all existing voice technologies: voice recognition systems, speech analysis and processing systems, text-to-speech systems and voice biometrics. [3]
According to experts, voice assistant technology is now generally available, and an assistant can be written in almost any programming language; in 2020 the most popular choice was the high-level language Python. The assistant is activated by a voice command. The received signal is converted into digital form and filtered to remove external noise. The converted signal is then sent to the identification subsystem.
In this subsystem, the signal database is first queried to recognize the command. If the input signal matches a stored signal, recognition is considered successful and the command is passed to the executing device, which performs the corresponding action. If the voice command is not recognized, the system returns to the start (voice command input), and the algorithm repeats until the command is recognized. [4]
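The recognition loop described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular assistant: the signal database `SIGNAL_DB`, the four-number signals, the noise gate and the cosine-similarity threshold are all assumptions made for illustration, and a real system would work with audio features rather than short lists of numbers.

```python
import math

# Hypothetical "signal database": command name -> stored reference signal.
SIGNAL_DB = {
    "lights_on": [0.9, 0.1, 0.8, 0.2],
    "lights_off": [0.1, 0.9, 0.2, 0.8],
}


def noise_gate(samples, floor=0.05):
    """Filter step: zero out samples whose magnitude is below the noise floor."""
    return [s if abs(s) >= floor else 0.0 for s in samples]


def cosine_similarity(a, b):
    """Similarity between two equal-length signals, from -1.0 to 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(signal, threshold=0.9):
    """Identification step: return the best-matching command, or None."""
    best_name, best_score = None, threshold
    for name, reference in SIGNAL_DB.items():
        score = cosine_similarity(signal, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def listen_until_recognized(inputs, execute):
    """Repeat input -> filter -> identify until a command is recognized.

    `inputs` is any iterable of digitized signals; `execute` stands in for
    the executing device. Returns (command, attempts), or (None, attempts)
    if the inputs run out first.
    """
    attempts = 0
    for raw in inputs:
        attempts += 1
        command = identify(noise_gate(raw))
        if command is not None:
            execute(command)
            return command, attempts
    return None, attempts
```

For instance, `listen_until_recognized([[0.5, 0.5, 0.5, 0.5], [0.88, 0.12, 0.79, 0.03]], print)` rejects the first input, because it is not close enough to either stored signal, and executes `lights_on` on the second attempt.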
Thus, a voice assistant is a modern service based on artificial intelligence that recognizes human speech and can perform various actions in response to voice commands.
Voice assistants are most often used in smartphones and smart speakers, as well as in some modern browsers. There are currently several widely used voice assistants, each with its own strengths and weaknesses. [5]
Home voice assistants, for example, are small speakers that the user can place wherever is convenient. On mobile devices, the assistant runs as a dedicated application that has to be downloaded to the device.
First place among home assistants goes to "Alice" from Yandex. The speaker comes with Russian preinstalled, which makes the device convenient to use. It works with "iOS" and "Android" and can be used to check the weather, play music and quickly find answers. The program searches the Internet and offers the most suitable result.
Positive aspects:
- Suitable for large spaces;
- Designed specifically for domestic users;
- Supports Yandex functionality.
Negative aspects:
- The assistant does not always provide clear answers;
- Users often receive humorous answers instead of a precise answer.
Another, less popular home assistant is the Google Nest Mini (2nd gen), an affordable speaker that can be used in the car and at home. Google Assistant is preinstalled on the device, which connects via Wi-Fi or Bluetooth. The device can be mounted on a wall, and thanks to three speakers, users can get the information they need anywhere in the house. It provides up-to-date information and plays music.
Positive aspects: 1. Clear sound; 2. Quick answers to questions.
Negative aspects: 1. Language settings need to be configured; 2. The English language pack is preinstalled.
Third place goes to Apple's "Siri", which runs on Apple's smart speakers and other "Apple" devices. For home use, a speaker with sensitive microphones that quickly pick up commands is advisable; it can also be used in a car. The assistant recognizes many languages and, if desired, can answer in a male or female voice. It is widely used on phones and works quickly on the iPhone, iPad, Apple TV and Apple Watch. You can use it to manage calls, music and applications.
Positive aspects: 1. Clear sound; 2. Quick command recognition; 3. Ease of control.
Negative aspects: can only be used with Apple devices.
Voice assistants created specifically for personal computers likewise help users find the information they need.
According to experts, first place in this group goes to a home voice assistant called "Gorynych", which simplifies working with a computer.
Voice commands can be used to control applications and the mouse, and the size of the dictionary allows commands to be recognized reliably. If desired, the dictionary can be extended with new phrases. The assistant can be downloaded using any browser.
The assistant also takes up little memory, so it does not slow the computer down.
Positive aspects: 1. Quick search for requested information; 2. Text input is possible; 3. Launches applications installed on the personal computer.
Negative aspects: no disadvantages have been identified by users or specialists.
Second place goes to "Cortana", the assistant built for Microsoft's platform. It is also supported on "Android", "Xbox One", "Windows Phone" and "Microsoft Band", so it can be installed on tablets as well. With "Cortana" you can plan your day, get directions, find information, read emails and find music. The assistant can also open an application when the user's hands are busy.
Positive aspects: 1. Easy to install; 2. Works on computers and phones.
Negative aspects: no disadvantages have been identified by users or specialists.
In third place is "Marusya", a relatively new domestic voice assistant that is only beginning to gain popularity. It was released by "Mail.ru Group" and can be installed on personal computers, tablets and smartphones. With it, users can quickly find information on the Internet. The assistant can also remind you of an important date and plan your day down to the minute.
Positive aspects: 1. Fast request processing; 2. Works on computers and phones; 3. Very easy to install.
Negative aspects: can only be used if you have Russian citizenship.
It should also be noted that voice assistants are most frequently used on smartphones.
Through the assistant application, users can open additional tabs and use their mobile device in speakerphone mode.
In first place is the popular "Google Assistant". The most widely used assistant for smartphones recognizes up to three dozen languages and supports "Android", "iOS" and the "Chrome" browser. To use it, you install the program and activate it with the phrase "Okay, Google". The assistant can plot a route, report the weather or place a call when the user's hands are busy. It can also write a message and find and open an application.
Positive aspects: 1. High speed; 2. Quick search for the necessary information.
Negative aspects: answers to queries are too brief.
In second place is the voice assistant "Dusya", which can be activated by voice or by a wave of the hand. Users can create their own tasks in advance and run them in the required mode. The assistant reminds you of important matters in good time and can send a message. You can also use it to look up information.
Positive aspects: 1. Easy to operate; 2. Users can create their own functions; 3. Can work without a voice command.
Negative aspects: there is no free version.
Third place goes to "Amazon Alexa". This assistant can be considered universal and is often installed on mobile phones; it is also very popular for home automation. The application lets you write emails, play your favorite music, search the Internet, and so on.
For home use, you need to buy a dedicated device, which can then be used to open blinds, turn on lights and music, and open doors. After setup, the device is activated and responds only to certain voices.
Positive aspects: 1. Universal; 2. Works quickly; 3. Can open applications.
Negative aspects: no disadvantages have been identified by users or specialists.
The voice assistants listed above are extremely popular.
However, one cannot fail to mention special voice assistants from domestic developers:
- "Marusya" is a voice assistant developed by "Mail.ru Group". Launched on June 17, 2019 in test mode. "Marusya" is available on the "iOS" and "Android" platforms as a separate application. (some information was presented above).
- "Oleg" is a virtual voice assistant in the field of finance and lifestyle services, developed by the Tinkoff Group. It works in the Tinkoff mobile application. You can communicate with it by voice or using a mobile keyboard.
- "Grigory" is a new product from "Beru.ru", which calls all clients of the marketplace, including the pilot segment of users.
- "Alexandra" is a modern metro bot that successfully functions in the Moscow Metro mobile application, messengers and metro social networks, and uses artificial intelligence and machine learning.
- "Elena" is a virtual operator of the Megafon support service, ready to consult subscribers on most questions that arise, able to work with users in both voice and text formats.
- "Marvin" is a voice assistant from MTS, which can tell you the weather, read a book or a fairy tale, play music, read the latest news, make a to-do list, and control a "Smart Home".
- Chatbot "Vasya" is a migration assistant of the Main Directorate for Migration Issues of the Ministry of Internal Affairs of Russia.
Thus, these assistants are actively entering our lives, and every day more people use them.
However, when choosing a voice assistant, you need to consider which functions matter to you. Each voice assistant has its own functionality, which is regularly updated, so it is very important to choose against individual criteria. [6]
Experts have tested several systems that appeared on the first page of search results, including the assistants from Google, Yandex (Alice), Apple (Siri), Microsoft and Amazon. In their assessment, Google Assistant and Yandex Alice stood out, giving more correct answers to the experts' questions. At the same time, the experts note that voice assistants are not ideal and need to be improved.
As voice assistants are integrated into more devices, marketers may need to adjust how they communicate with users and make that communication more personal. This will only be possible, however, if companies find a way to protect voice assistants effectively against fraud. [7]
Even now, however, voice assistants have significantly reduced the time spent on simple, routine daily tasks, as the studies cited above show.
The functionality of voice assistants is quite extensive: they converse with the user, search the Internet and give short answers to queries, call a taxi, place calls and write messages, play music, set alarms, and plot a route with a search for points of interest along the way. Voice assistants also take into account the user's location, the time of day and the day of the week, as well as the history of the user's previous requests.
Some sectors of the economy where voice assistants have found application can serve as examples.
Voice assistants are now present in many fields and industries, often in unconventional forms. For example, Zyrtec, a well-known brand of allergy medication, launched a voice assistant that gives users useful information about risk factors for people with allergies. Virtual bots from Tide teach how to remove stains, and the digital assistants of the job search service hh.ru select job offers or report the average salary. Nike has also decided to make life easier for its customers: shoelaces can be tightened by voice command from an iPhone or smartwatch.
Beyond purely "image" projects, there are industries in which voice technologies directly affect efficiency and the performance of very specific tasks.
In retail, for example, online purchases are made through voice assistants, and North American retailers use voice technologies to fully automate sales processes. The popularity of voice technologies in online shopping is explained by greater customer comfort: the assistant can easily answer a question such as "Where can I buy a coat at the most affordable price?" or give competent advice on choosing goods.
Globally, the most advanced voice assistants are gradually becoming an integral part of the team in those retail niches that are characterized by frequent repeat orders.
The trendsetters at present include leaders in their industries such as Amazon and McDonald's.
Voice assistants are also a powerful tool for collecting data, monitoring and improving service quality, optimizing processes and checking compliance with corporate standards. For example, the Russian Post robot, built on Yandex SpeechKit technology from the Yandex.Cloud platform, has proved effective at accepting a request, determining the root cause of an incident, and registering the request and assigning it a number before it is passed on for processing.
In the banking sector, voice requests can be used in much the same way as in retail, that is, for recurring orders, including microloans. Voice assistants have also proven highly effective in medicine, where voice interfaces are becoming personal doctors with vast experience: their "memory" is practically inexhaustible.
Smart assistants will not only advise the user competently but also prescribe a treatment plan. Robots also rarely forget or overlook anything when processing test results. One of the most striking examples of this approach today is the medical system "Triad Health AI", which uses "Google Home" and "Amazon Alexa" in the treatment of Parkinson's disease.
It is equally obvious that highly specialized niches, such as the sale of engineering equipment, will remain on the technological periphery for a long time. Such areas have not yet moved to mass online sales, let alone to voice assistants.
Thus, voice assistants will continue to develop and improve and will find new areas of application. One aspect will become increasingly important: once the voice assistant has recognized speech and converted it into text, it must understand what kind of answer the user expects.
In Alice, for example, user queries are first sorted by intent and only then routed to thematic sections, such as music playback or informal conversation. The key task of the intent classifier is therefore to determine what exactly the user meant. In the intent classifier, the query is split into words and punctuation marks. These are mapped to embeddings, word representations trained on very large amounts of data, which capture the contexts in which the user's words are typically used.
For each intent there is a special template that extracts useful information from what the user has said; this component is called a semantic tagger. When asking questions, people rarely state everything needed to answer them, so the voice assistant must fill in the gaps itself.
For example, to report the weather in a specific city, the assistant can ask the user clarifying questions or, if geolocation is enabled on the device, obtain the necessary information itself, which is an important advantage of the system.
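The chain "tokenize, classify the intent, tag the slots, fill the gaps" can be sketched in Python as follows. This is only an illustration and not how Alice is implemented: keyword overlap stands in for the embedding-based classifier described above, and the names `INTENT_KEYWORDS`, `KNOWN_CITIES` and `device_city` are hypothetical.

```python
import re

# Hypothetical keyword sets; a real classifier would use trained embeddings.
INTENT_KEYWORDS = {
    "weather": {"weather", "forecast", "rain"},
    "music": {"play", "music", "song"},
}
KNOWN_CITIES = {"moscow", "kazan", "london"}


def tokenize(query):
    """Split the query into lowercase words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", query.lower())


def classify_intent(tokens):
    """Pick the intent whose keywords overlap most with the query, if any."""
    token_set = set(tokens)
    scores = {name: len(words & token_set) for name, words in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def tag_weather_slots(tokens):
    """Semantic tagger for the weather intent: extract the city if present."""
    slots = {"city": None}
    for token in tokens:
        if token in KNOWN_CITIES:
            slots["city"] = token
    return slots


def handle(query, device_city=None):
    """Return (intent, reply); `device_city` models enabled geolocation."""
    tokens = tokenize(query)
    intent = classify_intent(tokens)
    if intent != "weather":
        return intent, None  # other intents are out of scope for this sketch
    slots = tag_weather_slots(tokens)
    if slots["city"] is None:
        if device_city is None:
            return intent, "Which city do you mean?"  # clarifying question
        slots["city"] = device_city  # fill the gap from geolocation
    return intent, f"Looking up the weather in {slots['city'].title()}"
```

With this sketch, `handle("What is the weather?")` asks a clarifying question, while `handle("What is the weather?", device_city="moscow")` fills the missing city from the device location.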
A further advantage is that a request that does not match any scenario is not ignored but redirected to search or to the informal conversation module; in Alice, for example, this module is called "Chat". Voice assistants are very often used not to learn or do something specific but to play, for example by asking the assistant what books it likes or what it is wearing.
This task is solved with standard editorial answers: the developers select hundreds of the questions users ask most often and write several answer options for each. All answers must be written in a single style so that they form a coherent image of the assistant.
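A minimal Python sketch of this fallback routing is given below. The sample questions and answers, the `QUESTION_WORDS` heuristic for sending a request to search, and the function names are all invented for illustration; they are not taken from any real assistant.

```python
import random
import re

# Hypothetical editorial answers: several options per popular question.
EDITORIAL_ANSWERS = {
    "what books do you like": [
        "Anything with a good plot twist.",
        "Dictionaries. So many words!",
    ],
    "what are you wearing": [
        "Just a few lines of code.",
        "The same case as yesterday.",
    ],
}
# Assumed heuristic: requests starting with these words go to search.
QUESTION_WORDS = {"who", "what", "where", "when", "why", "how"}


def normalize(query):
    """Lowercase the query, drop punctuation and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", query.lower()).split())


def route(query, choose=random.choice):
    """Return (module, payload) for a request that matched no scenario."""
    key = normalize(query)
    if key in EDITORIAL_ANSWERS:
        return "editorial", choose(EDITORIAL_ANSWERS[key])
    words = key.split()
    if words and words[0] in QUESTION_WORDS:
        return "search", key
    return "chat", key
```

Here `route("What books do you like?")` returns one of the prepared answers, `route("Where to buy a coat?")` is sent to search, and `route("I am bored")` falls through to the conversation module.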
To answer vague questions and unclassifiable sentences, voice assistants typically use simple neural networks trained on texts from media, books and films. Alice, for example, learns from a variety of materials in which characters, among other things, swear and argue.
As a result, voice assistants can learn things quite different from what the developers intended. And when the assistant encounters expressions it does not know, it responds with essentially random phrases, because the words are unfamiliar to it.
At first glance, the advantages of voice assistants are obvious. People today want to receive information from the Internet immediately. The pace of life leaves little time for text interfaces, while the help lines of financial institutions, government agencies and the most popular telephone services are overloaded with user requests. In this context, voice assistants can relieve some of the pressure and solve some of these problems.
Discussions of voice and conversational interfaces typically span multiple systems, from bots that answer simple questions or make random jokes to complex systems used at the industrial level.
For example, a surprisingly good voice interface is now available for filing tax returns in England. The key players in this area, however, are still Amazon's Alexa, Apple's Siri, Google Assistant and Microsoft's Cortana. Alibaba's assistant, which is very well designed for Chinese users, is also in demand.
It should also be noted that the companies with the resources, knowledge and skills to take a significant step forward in the development of voice assistants are, surprisingly, not at all interested in taking it, because voice interfaces and voice assistants are innovations that fundamentally change the current state of affairs. [8]
Google, for example, makes money on advertising. If, instead of links with advertising displayed next to them, the user starts to receive a ready-made answer, the question arises of what to do with the advertising. Siri, in turn, is an excellent tool for increasing iPhone sales. It does its own narrow job, and for Apple it makes no sense at present to build something new that would transform the established App Store ecosystem.
Users expect that in the foreseeable future they will be able to express their wishes naturally and the system will fully understand them. The system will therefore have to adapt to the human, not the other way around, and the design of the most promising voice assistants should start from an understanding of universal human characteristics. [9]
However, one cannot ignore the dangers and threats that modern technology in the form of voice assistants poses. The widespread use of voice assistants and the development of the Internet of Things raise the issue of security for all those who actively use these modern technologies. [10]
Many owners of voice assistants and smart speakers are seriously concerned about how much information these devices capture when recording conversations. Although the recorded speech is usually stored in encrypted form on the developer's servers, the microphone can be turned off and any recordings can be deleted manually, the technology is still far from perfect, and it is not always clear what to expect from it.
Various experts share this view and say that in the future devices will be able to identify the voice of a specific person and keep a list of those who have access to the device.
They are also concerned that voice assistants can easily fall into the hands of children, who may unknowingly make very large purchases and create problems for their parents. Such incidents have become so common that many large retailers have launched programs to refund money for goods ordered by small children. One such case occurred in Dallas, Texas (USA), when a six-year-old child asked an "Amazon Echo" smart speaker for a dollhouse and a couple of pounds of cookies. The assistant, "Alexa", which does not distinguish between voices, promptly fulfilled the request and purchased one of the more expensive models.
Some companies noticed these problems and immediately took advantage of them, exploiting the fact that voice assistants do not recognize the voice of their owner. For example, one company released an advertising video in which a search query was spoken aloud to activate Google Home speakers. The "smart" speakers of viewers were triggered and, regardless of their owners' wishes, opened a Wikipedia article about the product of the restaurant named in the query. Google has since dealt with the consequences of this aggressive campaign, and the speakers no longer respond to such tricks. Nevertheless, the risk that similar viral campaigns will be repeated remains high.
Accordingly, many users now describe voice assistants as a real nightmare for privacy, since every year these systems process more information from each user's daily life. Yet if such attitudes have affected sales, the effect has been very small. Voice assistants are breaking popularity records, and many users consider owning one not only convenient but also promising and prestigious. [11]
Systems that combine voice with a visual interface are now being actively developed. A major advantage of visual interfaces is that the available interactions are visible, whereas in a voice interface the user does not know what exactly is available. Interaction with screens is also a very well-studied topic. The screen will remain even if the voice interface works well, if only because visual perception is a person's main channel and voice is an auxiliary one. Voice can work together with data shown on the screen without being subordinate to it. In "Alexa", for example, voice is currently the main component: the user can install an application to see the system's responses on a screen if some of them are hard to hear. This concept is changing, however, and the next version of "Amazon Echo" is expected to have a screen.
There are other problems directly related to voice assistants. They may store more information than intended. Assistants are supposed to start recording only after they hear the activation phrase from the owner, but they are often triggered by similar-sounding words, a television or music player, or ordinary everyday conversation.
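Why similar-sounding words cause accidental recording can be shown with a small Python sketch. It is a deliberate simplification: real assistants compare audio, whereas here the spelling similarity from the standard `difflib` module stands in for acoustic similarity, and the wake word and thresholds are assumed values.

```python
from difflib import SequenceMatcher

WAKE_WORD = "alexa"  # hypothetical activation word


def similarity(heard, wake_word=WAKE_WORD):
    """Text similarity (0.0 to 1.0) used here as a stand-in for acoustic similarity."""
    return SequenceMatcher(None, heard.lower(), wake_word).ratio()


def should_start_recording(heard, threshold):
    """Start recording only if the heard word is close enough to the wake word."""
    return similarity(heard) >= threshold
```

With a permissive threshold of 0.7, the word "alexis" is close enough to "alexa" to start a recording; with a stricter threshold of 0.9, only the wake word itself does. The trade-off is that a stricter threshold also makes the device more likely to miss a genuine command spoken unclearly.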
In addition, employees of development companies may have access to users' personal, private information.
This is because people trust new information technologies almost completely, do not think about the security of their personal information and, as a result, rarely check how voice assistants work, even though the assistants can capture and transmit purely personal information, including medical history.
According to The Guardian, Apple has changed its quality control program for the voice assistant Siri. Under the new rules, employees can no longer listen to voice commands sent by Siri users without the user's consent.
Such concerns and changes are not unfounded. Attackers can quite easily make use of users' personal data. Like any other information collected by companies, voice recordings are exposed to the risk of hacker attacks. They can be used to imitate the user's voice and break into accounts that seem reliably protected by biometrics. In some cases no attack is even needed: in one known incident, an Amazon user who requested a file with his own data received, by pure chance, more than one and a half thousand audio recordings of a complete stranger.
Conflicts of interest may also arise. Companies collect users' personal data in order to solve clients' problems as well as possible, but any personal information collected can be used not only by the company itself but also for the benefit of its partners. [12]
According to experts, some employees of large companies that develop voice assistants, knowing the relevant codes and the technology of the systems, are able to find out where requests to a voice assistant came from and quickly work out the user's home address and other available information.
It is expected that in the very near future the error rate will fall by an order of magnitude thanks to the latest machine learning models. It is quite possible that in the foreseeable future each user will have a personal voice assistant with the voice of their choice.
Voice interfaces are spreading extremely quickly, and soon it will probably be possible to see forms of personalization that are unavailable today with text search. At the same time, I consider it necessary to note that although voice assistants are already well developed, the technology has not yet reached its limit. In the coming years it will develop in different directions, and new voice assistants will soon find a "face" of their own, which will expand their capabilities.
In conclusion, the practice of recent years shows that many domestic and foreign IT companies developing voice assistants have already entered a so-called "arms race". This struggle will undoubtedly give the winners an enormous body of information, which can become a lasting source not only of competitive advantage but also of very solid income.
In this regard, I consider it appropriate to say: "Think about your personal information security!"
Literature:
1. All-Russian Congress of Young Scientists (Saint Petersburg). Collection of Works of the VIII Congress of Young Scientists / Ministry of Science and Higher Education of the Russian Federation, ITMO University. - Saint Petersburg: ITMO University, 2019. Vol. 3. - 371 p.
2. Step into the Future: Artificial Intelligence and the Digital Economy: Smart Nations: Economy of Digital Equality: Proceedings of the III International Scientific Forum / Ministry of Science and Higher Education of the Russian Federation, Federal State Budgetary Educational Institution of Higher Education "State University of Management"; edited by P. V. Terelyansky, S. M. Malkarova. - Moscow: GUU, 2020. Issue 1. - 360 p.
3. Economy. Law. Innovations. 2020, No. 4. - 106 p.
4. Ponachugin A. V., Pichuzhkina D. Yu., Smekalova E. S. Voice assistant as a data processing technology // Science without borders. 2020. No. 6 (46). [electronic portal] URL: https://cyberleninka.ru/article/n/golosovoy-pomoschnik-kak-tehnologiya-obrabotki-dannyh (date of access: 03/17/2021).
5. Personnel, Social and Business Communications Management: Methods, Models, Technologies: Proceedings of the All-Russian Scientific and Practical Conference / Ministry of Science and Higher Education of the Russian Federation, Federal State Budgetary Educational Institution of Higher Education "State University of Management"; [editorial board: Ekimova K. V. et al.]. - Moscow: Publishing House of the State University of Management, 2019. - 171 p.
6. Marr, Bernard. Artificial Intelligence in Practice: 50 Cases of Successful Companies: [16+] / Bernard Marr, Matt Ward; translated from English by Ekaterina Petrova. - Moscow: Mann, Ivanov and Ferber, 2020. - 316 p.
7. Akhmaeva L. G. User experience and possibilities of using voice assistants in interactive marketing communications: Amazon Alexa, Google home, Apple Siri, Yandex Alice // Bulletin of the State University of Management. 2020. No. 5. [electronic portal] URL: https://cyberleninka.ru/article/n/p...yh-marketingovyh-kommunikatsiyah-amazon-alexa (date of access: 03/17/2021).
8. Voronezh State University. Series: Systems analysis and information technologies: scientific journal / founder and publisher: Federal State Budgetary Educational Institution of Higher Education "Voronezh State University". - Voronezh: Voronezh State University, 2006-. 2020, No. 1. - 178 p.
9. Technological trends and models of digital transformation of the economy: monograph / Malyavkina L. I., Savina A. G., Sergeeva I. I. [et al.]; edited by Doctor of Economics, Professor Malyavkina L. I.; Ministry of Higher Education and Science of the Russian Federation, Oryol State University of Economics and Trade. - Oryol: OryolSUET, 2020. - 167 p.
10. Information and Security. 2020, Vol. 23, No. 3. - [2], 324-469 p.
11. Almanac of scientific works of young scientists / Ministry of Science and Higher Education of the Russian Federation, ITMO University. - St. Petersburg: ITMO University, 2019-. Vol. 2. - 2019. - 174 p.
12. Korotkikh, Tatyana Nikolaevna. Modern information technologies: a tutorial on the course "Modern problems of computer science and computer engineering" for students studying in the areas of 09.04.01 - "Computer science and computer engineering" / T. N. Korotkikh, I. I. Korotkikh; Ministry of Science and Higher Education of the Russian Federation, National Research University "MPEI". - Moscow: MPEI Publishing House, 2020. - 58 p.