MADRID, November 13 (Portaltic / EP) –
Amazon has announced the migration of most of the Alexa assistant's workloads to EC2 Inf1 instances, part of its EC2 web service and powered by Inferentia, the company's own chip designed to accelerate deep learning workloads.
Inferentia is a chip designed to accelerate deep learning workloads that the company first introduced in 2018, and it is offered through Amazon EC2, a web service intended to simplify web-scale cloud computing.
Now, Amazon has announced on its official blog that it has applied the chip to Alexa, its cloud-based digital assistant, achieving 25 percent lower latency and 30 percent lower cost than GPU-based instances for Alexa's text-to-speech workloads.
Alexa’s “brain” lives in the cloud, distributed across Amazon Web Services (AWS) servers. That is where the interaction between the user and the assistant is processed, so that the user receives an answer almost immediately.
Specifically, the chips built into an Alexa-enabled device detect the wake word that activates the assistant and turn on the microphone to record the user’s request. The recorded audio is then converted into text (a step known as automatic speech recognition) and analyzed in the cloud to understand what it means.
This second part of the analysis is what Amazon calls natural language understanding, which produces an intent (what the user wants) and a set of associated parameters (for example, the postal code if the user asked about the weather). Alexa then processes this information and prepares a response for the user.
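As a rough illustration of that idea (not Alexa's actual format, and with field names invented for the example), the output of the natural language understanding step can be thought of as an intent plus a set of parameters, which a downstream handler then acts on:

# Hypothetical NLU result; intent and slot names are illustrative only.
nlu_result = {
    "intent": "GetWeatherForecast",   # what the user wants
    "slots": {
        "postalCode": "28001",        # parameter extracted from the utterance
        "date": "today",
    },
}

# A handler branches on the intent and reads the associated parameters.
if nlu_result["intent"] == "GetWeatherForecast":
    postal_code = nlu_result["slots"]["postalCode"]
    # ...look up the forecast for that postal code and build a text reply...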
For the answer, Alexa needs to convert that text back into speech, which is then played through the device (for example, an Echo speaker) so the user hears the response.
Machine learning models are involved both in constructing the sentence that makes up the response and in giving it a natural sound. Until now, this last phase, converting text to speech, ran its workloads on instances based on graphics processing units (GPUs); the novelty announced by Amazon is that the Alexa team has migrated most of these workloads to EC2 Inf1 instances and Inferentia.
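Amazon has not detailed how the Alexa team carried out this migration, but as a general illustration, AWS's Neuron SDK lets developers compile a trained PyTorch model so its inference runs on the Inferentia chips of an Inf1 instance. A minimal sketch, assuming a placeholder model standing in for a text-to-speech network (the layers, shapes and file names here are invented for the example):

import torch
import torch_neuron  # AWS Neuron plug-in for PyTorch; registers torch.neuron

# Placeholder network standing in for a trained text-to-speech model;
# in a real workflow the trained model would be loaded from disk.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 80),
).eval()

# Example input with the shape the model expects, used to trace the graph.
example_input = torch.zeros(1, 256)

# Compile the model for Inferentia; operators the chip does not support
# fall back to running on the CPU.
model_neuron = torch.neuron.trace(model, example_inputs=[example_input])
model_neuron.save("model_neuron.pt")

# On an EC2 Inf1 instance, the compiled artifact is loaded like any
# TorchScript module, with inference executed on the Inferentia chip.
loaded = torch.jit.load("model_neuron.pt")
output = loaded(example_input)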