I believe this article can help designers who put in long hours at work but do not feel they are growing. It offers some tips on how to start building your soft skills.
In this article, we shall provide some background on how multilingual multi-speaker models work and test an Indic TTS model that supports 9 languages and 17 speakers (Hindi, Malayalam, Manipuri, Bengali, Rajasthani, Tamil, Telugu, Gujarati, Kannada).
It seems a bit counter-intuitive at first that one model can support so many languages and speakers, given that each Indic language has its own alphabet, but we shall see how it was implemented.
Also, we shall list the specs of these models, such as the supported sampling rates, and try something cool – making speakers of different Indic languages speak Hindi. If you are a native speaker of any of these languages, please share your opinion on how these voices sound, both in their respective language and in Hindi.
In our last article we made a bunch of promises about our speech synthesis.
After a lot of hard work, we have finally delivered on these promises:
- Model size reduced 2x;
- New models are 10x faster;
- We added flags to control stress;
- Now the models can make proper pauses;
- High quality voice added (and unlimited "random" voices);
- All speakers squeezed into the same model;
- Input length limitations lifted, now models can work with paragraphs of text;
- Pauses, speed and pitch can be controlled via SSML;
- Sampling rates of 8, 24 or 48 kHz are supported;
- Models are much more stable: they no longer omit words.
This is a true breakthrough for us, and we are not planning to stop anytime soon. We will shortly be adding as many languages as possible (the CIS languages, English, European languages, Hindic languages). We are also still planning to make our models another 2-5x faster.
We are also planning to add phonemes and a new model for stress, as well as to reduce the minimum amount of audio required to train a high-quality voice to 5 to 15 minutes.
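To give a feel for what controlling pauses, speed and pitch via SSML looks like, here is a minimal sketch that assembles an SSML string from plain sentences. It uses only the standard W3C SSML elements `<speak>`, `<break>` and `<prosody>`. The helper name `build_ssml` and its parameters are hypothetical, and which tags and attribute values a particular TTS model accepts is an assumption you should check against that model's documentation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(sentences, pause_ms=500, rate="medium", pitch="medium"):
    """Join sentences with <break> pauses inside one <prosody> element.

    Hypothetical helper for illustration; it is not part of any TTS library.
    """
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # Escape &, < and > so user text cannot break the markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    # quoteattr() returns the value already wrapped in quotes.
    return (
        f"<speak><prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{body}</prosody></speak>"
    )


ssml = build_ssml(
    ["Hello there.", "Tom & Jerry say hi."],
    pause_ms=300,
    rate="slow",
    pitch="high",
)

# Parsing the result is a cheap well-formedness check before synthesis.
root = ET.fromstring(ssml)
print(ssml)
```

The resulting string would then be passed to whatever SSML-aware synthesis call the model exposes, together with the desired sampling rate.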
Ever since Epic released the UE5 technology demo at the beginning of 2021, the discussion about UE5 has never stopped. The technical discussion has mainly centered on two new features: the global illumination technology Lumen and the extremely detailed model technology Nanite. Some articles [1] have already analyzed Nanite in more detail. Starting from a RenderDoc analysis and the UE5 source code, combined with some existing technical material, this article aims to provide an intuitive overview of Nanite and to clarify its algorithmic principles and design ideas, without going into too many source-level implementation details.
Lumen is UE5’s GI system. Unlike traditional real-time GI, which only includes the contribution of indirect diffuse reflection, Lumen includes both indirect diffuse reflection and indirect specular highlights, providing a complete new set of indirect lighting. Lumen supports both hardware-based (RTX) and software-based tracing algorithms. This article analyzes the pipeline, algorithms, and data structures of the indirect diffuse part of Lumen GI based on software tracing, in order to understand the basic principles and operating mechanism of Lumen from a macro perspective.
The core of Lumen includes the following parts:
Today, we will share some knowledge about resource memory leaks. Memory leaks are among the most common issues we see, and also among the most feared. Why? Because until the source of the leak is located, we cannot predict its extent, and we have no idea whether it will blow up at some point in production. We have received feedback from developers whose players had no problems during the first half hour of play but saw the game stutter more and more after 3 to 4 hours, which the developers had not expected. How can this be solved? Today’s article answers such questions.
UWA’s GOT Online-Assets report has a resource occupancy trend chart. If there is a rising trend like the one below, you must pay special attention.
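As a rough illustration of what "a rising trend" means in numbers, the sketch below fits a least-squares line to resource memory samples and flags the run if memory grows faster than a threshold. The sample data, the function names and the 0.5 MB per minute threshold are all hypothetical and are not taken from the UWA report. A steady rise can also come from legitimate loading, so treat a flag as a prompt to investigate, not as proof of a leak.

```python
def trend_slope(samples):
    """Least-squares slope of (minutes, megabytes) samples, in MB per minute."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("all samples share the same timestamp")
    return covariance / variance


def looks_like_leak(samples, min_slope_mb_per_min=0.5):
    """Flag a run whose memory grows faster than the (hypothetical) threshold."""
    return trend_slope(samples) > min_slope_mb_per_min


# Hypothetical sessions: (minutes since start, resource memory in MB).
leaking = [(0, 300), (30, 330), (60, 362), (90, 391), (120, 420)]
stable = [(0, 300), (30, 304), (60, 298), (90, 301), (120, 299)]

print(looks_like_leak(leaking))  # grows by roughly 1 MB per minute
print(looks_like_leak(stable))   # fluctuates around 300 MB
```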
Access the power of hardware accelerated video codecs in your Windows applications via FFmpeg / libavcodec
This article continues the series of articles on load tests. Today we will analyze the testing methodology and answer the question: "How many IP cameras can be connected to a WebRTC server?"
Do you remember how just a few years ago it was a disaster to lose a camera at the end of a vacation? All memorable pictures and videos then disappeared along with the lost device. Probably, this fact prompted the great minds to invent cloud storage, so that the safety of records no longer depends on the presence of the devices on which these records are made.
We continue to review variants of load tests. In this article we will go over the testing methodology and conduct a load test that we will use to try and determine the number of users that could watch and stream at the same time, meaning the users will simultaneously publish and view the streams.
This article is a continuation of our series of write-ups about load tests for our server. We have already discussed how to compile metrics and how to use them to choose the equipment, and we also provided an overview of various load testing methods. Today we shall look at how the server handles stream mixing.
So, I finally found a moment to write a bit about how we created the water for TReload. Our basic goal was to flood all of the levels with acid - a lot of acid, as the flooded area is massive :) Here’s one of the results which we got out of this process:
In the previous article we went over a load test whose data could be used to choose a load-appropriate server. In the course of the testing, we would publish a stream on one WCS, and we would pick up that stream several times using a second WCS. The acquired results could be used as a basis for decisions on server operability.
Some would (justly) have concerns regarding the possible biases in such a test — after all, one of our servers was used to test another one of our servers. Could it be that we were using a specially optimized code that skewed the results in our favor?
In any project, a great deal of importance is placed on the selection of server hardware and WebRTC streaming is no exception. One of the key principles of such a selection is balance – the hardware should be powerful enough to handle the streams with no drops in quality, but not too powerful so as to waste resources. So, how does one choose the right server?
Monitoring systems are a vital tool for any system administrator, because they can be used to extract specific information from services, such as:
In the comment sections of our articles about our server there are often users who say: "Why would you jump through so many hoops, when you can do the same with a single line of code in FFmpeg!?"
In this article we will once again return to the tired topic of webinars and webinar hosting tools. And no, we're not about to code a whole new system for webinar hosting – there are already plenty of those. Instead, we will talk about connecting drawing software to the webinar, so that you can draw by hand and broadcast the process.
PVS-Studio has a mascot that became inseparable from the brand - a unicorn. Lately we've been getting many questions about our magic steed: why the unicorn, why has he changed so much, does he have hooves, how come he doesn't wear pants, and how do we draw him. The answers are finally here, in this very article.
Attention: there will be a lot of pictures. And I mean A LOT.
The potential of VoIP for your customers is simply phenomenal. Businesses are experiencing the advantages of VoIP’s cost-efficiency and reliability, and now you can pass these benefits on to your own customers very easily. Cloud telecommunication is sophisticated and easily integrated. Confidence in this technology is growing fast. There has never been a better time to start talking to your customers about adopting this solution. It will deliver huge business benefits for them and has the potential to increase business income and profitability.