DeepMind’s TransNAR, Camb AI introduces Mars5, Nvidia’s new model

Translating and dubbing are getting easier.

Hello, Starters!

Not a day goes by without new tools and models emerging that could shake up the AI landscape. It can be hard to keep track of them all, so let us guide you through it!

Here’s what you’ll find today:

  • TransNAR: Combining approaches

  • Camb AI’s speech and translation models

  • Nvidia presents Nemotron-4 340B

  • McDonald’s stops its AI drive-thru project

  • Google’s “Personal Health Large Language Model”

  • And more.

Google DeepMind researchers have presented TransNAR, a hybrid AI architecture. TransNAR merges transformers with Neural Algorithmic Reasoners (NARs), combining the strengths of both methods to produce a language model that excels at algorithmic reasoning and complex tasks.

Each approach has limitations on its own: transformers often fail on tasks that require precise algorithmic calculations, while NARs cannot be applied to unstructured problems posed in natural language. TransNAR addresses both weaknesses by letting the transformer handle the language while the NAR works on a graph representation of the problem, and the combination outperforms pure transformer models.

As days go by, more AI tools emerge, and in the voice cloning field, ElevenLabs continues to reign. However, it may not be for long, as Camb AI has entered the scene. They have recently open-sourced Mars5, a voice cloning AI model with realistic results that captures nuances in speech such as emotion, rhythm, and intonation while allowing users to dub content into over 140 languages.

The company has already struck deals with sports organisations such as the Australian Open and Major League Soccer, and is currently working on Boli, a translation model that it says surpasses Google Translate in understanding context and colloquialisms.

Training data is widely cited as one of the main bottlenecks for further progress in AI, and Nvidia appears to be working on a solution. It has released "Nemotron-4 340B," a family of open models designed to create high-quality synthetic data.

Nemotron-4 340B first appeared unannounced on the LMSYS Chatbot Arena as "june-chatbot." Trained on 9 trillion tokens, with a 4,096-token context window and a commercially viable licence, this model has the potential to be a game changer, as it will allow businesses to craft their own fine-tuned LLMs cost-effectively without the need for real-world datasets.

🍟 After two years of testing, McDonald's has decided to end its automated order-taking project, which was part of a partnership with IBM. The AI project aimed to optimise operations and speed up drive-thru service. However, it is now being removed from more than 100 restaurants in the US. The fast-food company says it still expects to implement the technology in the future.

🏥 Google has unveiled the "Personal Health LLM," a fine-tuned version of Gemini designed for wearables. The model interprets data from smartwatches and heart rate monitors to answer queries and make predictions, and Google reports that it outperforms professionals in the field.

Thank you for reading!