Mistral’s Pixtral 12B, More on “Strawberry”, Apple’s visual AI approach
These past few days have been busy.
Hello, Starters!
There's a lot of movement in the AI industry; we know that's the norm. But September is a month that promises a lot of advancements from both the open-source community and well-known giants. We'd better fasten our seatbelts!
Here’s what you’ll find today:
Mistral introduces Pixtral 12B
“Strawberry” is closer than we thought
Learn about Apple’s Visual Intelligence
Introducing: DeepSeek-V2.5
Google turns notes into “podcasts”
And more.
Ready to explore the tech world with insights from top experts? HockeyStick Show is your perfect podcast for valuable advice and fresh perspectives. Don't miss out—subscribe now!
🖼️ Mistral introduces Pixtral 12B (1 min)
Mistral has made its entrance into the multimodal world with Pixtral 12B. Built on Nemo 12B and, as its name suggests, packing 12 billion parameters, the model can answer queries about images supplied as links or as base64-encoded data, and it handles a variety of image tasks, such as captioning a picture or counting the objects in a photo.
Pixtral 12B is currently available on GitHub and Hugging Face, and since it's released under an Apache 2.0 licence, users are free to fine-tune it.
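If you want to try it straight away, here's a minimal sketch of what a first query could look like. It assumes Mistral's v1 Python client and the hosted model name "pixtral-12b-2409"; both are assumptions, so check Mistral's docs for your setup:

```python
import os
from mistralai import Mistral

# Minimal sketch: ask Pixtral a question about an image.
# Assumes the `mistralai` v1 client and the hosted model name
# "pixtral-12b-2409"; adjust both to match your environment.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="pixtral-12b-2409",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "How many dogs are in this photo?"},
                # Images can be passed as a link or as base64-encoded data;
                # this URL is a hypothetical placeholder.
                {"type": "image_url", "image_url": "https://example.com/dogs.jpg"},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```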
🍓 “Strawberry” is closer than we thought (2 min)
The fruit-related theme continues as we receive more news about OpenAI's upcoming "Strawberry" model. Recent claims from two insiders who have tried it suggest a launch date closer than initially expected, with the model possibly making headlines in two weeks.
Strawberry will be integrated into ChatGPT, and what makes it stand out is a feature that lets it "think" for 10 to 20 seconds before answering. This brief pause is designed to enhance its logical reasoning in complex areas like maths, programming, and design, as well as on new tasks.
🍎 Learn about Apple’s Visual Intelligence (1 min)
With the official announcement of the iPhone 16, we can finally say we're getting closer to Apple's full-on AI era. However, the excitement must wait until October, when the 'Apple Intelligence' features start rolling out on iOS 18. One standout is Visual Intelligence, which resembles a concept we've already seen from OpenAI and Google.
Visual Intelligence, triggered via Camera Control (a new button on the iPhone 16 and 16 Pro), lets users ask questions about their surroundings and pull information from pictures they've taken, such as the date and time on a flyer or the breed of a dog.
🌎Every day, new open-source models compete for the top spot, so it's common to hear a different one hailed as the "leader." These claims are sometimes premature, though, and some models fail to live up to the hype. DeepSeek is taking its chance with DeepSeek-V2.5, now available on Hugging Face, and its benchmark results and performance back its claim to the "world's top" open-source spot.
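For those who want to poke at it themselves, here's a minimal sketch of loading the checkpoint with Hugging Face's transformers library. The repo id "deepseek-ai/DeepSeek-V2.5" and the chat-template usage are assumptions based on DeepSeek's usual releases, and the model is far too large for consumer hardware:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch: load DeepSeek-V2.5 from Hugging Face and run one chat turn.
# The repo id is an assumption; the checkpoint is hundreds of GB, so treat
# this as illustrative rather than laptop-runnable.
model_id = "deepseek-ai/DeepSeek-V2.5"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # pick a dtype suited to your hardware
    device_map="auto",       # shard the weights across available GPUs
    trust_remote_code=True,  # DeepSeek ships custom modelling code
)

messages = [{"role": "user", "content": "In one sentence, what is a mixture-of-experts model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```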
🎙️Google is transforming NotebookLM with "Audio Overview," a feature intended to turn complex information into easy-to-digest audio. Basically, your documents can become a "podcast" in which two AI hosts highlight the most relevant points of your notes, dig deeper, and summarise the material to aid comprehension.
⚡️Quick links
What did you think of today's newsletter?