# AI Weekly #31

Your weekly roundup of AI breakthroughs and developments

Aug 04, 2024

This Week in AI: Advancements, Investments, and Concerns

The AI Weekly brings you the latest happenings in the exciting world of Artificial Intelligence (AI) you might have missed last week. From groundbreaking research to real-world applications, here's a glimpse of what's making headlines:

AI in Business & Tech

🥱 Alibaba Unveils AI Assistant: Alibaba is launching a conversational AI-powered sourcing engine to streamline B2B sourcing for small and medium-sized businesses.
🧐 Labour Axes £1.3bn Tech and AI Funding: The UK's new government has halted funding for tech and AI projects, sparking industry outrage and concerns over future competitiveness.
😟 Nvidia Stumbles on Next-Gen AI Chip: A design flaw has delayed Nvidia's next-generation AI chip, impacting production and potentially affecting major tech players.
🤩 Meta Unveils Celebrity Voices for AI Assistant: Meta introduces AI assistants with celebrity voices, adding a new layer of personalization to user interaction.
🙊 OpenAI's Advanced Voice Mode: ChatGPT users gain access to a new feature mimicking natural human speech for more engaging interactions.
👥 Friend Launches AI Companion Necklace: A new wearable device that aims to combat loneliness by offering constant AI companionship and conversation.
🤖 Gemini 1.5 Pro Takes the Lead in Chatbots: Google DeepMind's advanced chatbot impresses with its ability to understand and respond to complex instructions across various modalities.
🖼️ Black Forest Labs Releases FLUX.1 Image Model: This open-source image generation model excels at rendering precise text and complex compositions.
🦾 Figure Raises $675M, Showcases Advanced Humanoid Robot: A Bay Area robotics firm raises significant funding and collaborates with OpenAI to enhance its robots' capabilities.
🙅 Microsoft Officially Lists OpenAI as a Competitor: The landscape of AI heats up as Microsoft recognizes OpenAI as a rival in both search and AI markets.
🌮 Taco Bell Expands AI-Powered Drive-Thru Plans: By year-end, Taco Bell plans to integrate voice AI technology in hundreds of locations to improve efficiency and customer experience.
🖼️ Midjourney V6.1 Image Model Release: Midjourney releases an updated image model offering improved image coherence, quality, and detailed features.
🖼️ Stability AI Introduces Stable Fast 3D: This model allows for rapid generation of high-quality 3D assets from a single image, ideal for diverse applications.
🧑‍✈️ GitHub Launches 'GitHub Models' for Developers: A new offering gives developers access to a range of AI models on the GitHub platform, streamlining AI-assisted development.
📜 Character AI Open-Sources 'Prompt Poet': This tool simplifies prompt design for AI interactions, making it easier for both developers and non-technical users.
🥽 NVIDIA Utilizes Apple Vision Pro for Robot Training: New tools accelerate robot training by providing synthetic data generation and AI-powered workflows.
📄 Google Reveals and Open-Sources Gemma 2 AI Models: This suite focuses on safety, efficiency, and transparency in AI deployment.
📹 Vimeo Launches AI Video Translation: Creators can now translate video audio and captions into 29 languages using Vimeo's AI-powered feature.
🖼️ Runway Releases Gen-3 Image-to-Video Tool: Users can generate high-quality video clips from still images or text prompts with this advanced AI model.
📄 Meta Introduces SAM 2 for Real-Time Object Segmentation: This model efficiently segments objects in both images and videos, enhancing video editing experiences.
💵 Apple Reveals Investment in AI Infrastructure: Apple invests significantly in AI infrastructure while acknowledging Google's hardware support in their training process.
💵 Canva Acquires Leonardo AI to Boost Design Capabilities: Canva bolsters its design software by acquiring an advanced generative AI platform.
🧑 Synthesia Introduces "Personal Avatars": Users can create lifelike digital representations of themselves for video content creation with Synthesia's new feature.
💵 Perplexity's Publisher Revenue Sharing Model: This initiative aims to ensure fair compensation for content creators whose work is used by Perplexity's AI.
🚶 Character AI Founders Join Google: Google bolsters its conversational AI capabilities by hiring Character.AI's cofounders and licensing their language models.

Beyond the Headlines

🎼 AI Music Generators Face Copyright Lawsuit: Music labels are suing AI companies for copyright infringement, raising questions about applying existing copyright laws to AI-generated music.
🕵️ Sam Altman: The New Face of Power?: Concerns emerge surrounding the rising influence of OpenAI's CEO and the potential ethical implications of unchecked AI development.

Stay tuned for our next newsletter, where we'll explore more advancements and discussions shaping the future of AI!

Alibaba Unveils AI Assistant

Alibaba is launching an AI-powered assistant, described as a "conversational sourcing engine," in September to facilitate B2B sourcing for small and medium-sized businesses (SMBs). The tool aims to streamline the search for business partners and products by using natural language processing and AI to create professional sourcing requests. It will allow direct comparisons of suppliers, reducing costs and enhancing SMBs' global trade capabilities. The initiative builds on Alibaba's existing AI applications, integrating advanced technology into its eCommerce operations.
For more details: https://www.prnewswire.com/news-releases/alibaba-unveils-the-worlds-first-ai-powered-conversational-sourcing-engine-302210977.html

Labour Axes £1.3bn Tech and AI Funding, Sparking Industry Outrage

The new Labour government in the UK has decided to halt £1.3 billion in funding previously promised by the Conservative administration for tech and Artificial Intelligence (AI) projects. This funding included £800 million for the development of an exascale supercomputer at Edinburgh University and £500 million for the AI Research Resource. The Department for Science, Innovation and Technology (DSIT) explained that the previous administration never allocated these funds in the budget, prompting the decision.

Industry critics argue this move could drive entrepreneurs to the US and harm the UK's position in crucial future industries. TechUK, a trade body, emphasized the need for quick new proposals to avoid losing out globally. DSIT defended its decision, stating it was necessary for economic stability and growth.

The future of the Edinburgh exascale supercomputer, expected to be significantly faster than current UK computers, remains uncertain. The University of Edinburgh, having already invested £31 million in the project, is seeking urgent discussions with the government. The tech sector, valued at £863 billion in early 2024, is a vital part of the UK economy, and continued investment in technology infrastructure is crucial for maintaining its competitive edge and achieving scientific breakthroughs.
For more information, read the BBC article.

Amazon's AI Bet Backfires, Bezos Wealth Takes $21B Hit

Jeff Bezos' wealth has decreased by $21 billion following a significant drop in Amazon's stock. The decline is attributed to concerns about the company's investment in AI and its potential impact on profitability. Investors are wary of Amazon's aggressive spending on AI technology amidst uncertain returns. This has led to a broader reevaluation of tech investments, reflecting the sector's volatile nature as companies navigate the integration of advanced technologies.

For more details, visit the Hindustan Times.

AI Music Generators Face Copyright Lawsuit

Music labels are suing AI companies like Udio and Suno for copyright infringement, arguing that these companies use AI to generate songs mimicking popular artists without proper licenses. This legal battle highlights the complexity of applying existing copyright laws to AI-generated content. The lawsuits contend that AI systems trained on copyrighted music threaten the livelihoods of human artists by creating imitations that could flood the market. The cases are expected to test the boundaries of "fair use" in the context of AI and could have significant implications for the music industry.

Sam Altman: The New Face of Power?

Sam Altman, CEO of OpenAI, the company behind ChatGPT, has become one of the most influential figures in AI, but his rise has also sparked significant concern. Altman, previously celebrated for his innovative work and global economic vision, is now facing scrutiny for potential ethical lapses and misleading representations. Despite publicly supporting AI regulation, reports suggest OpenAI has worked to weaken such regulations behind closed doors. Allegations of misleading information, questionable business practices, and insufficient focus on AI safety have marred Altman's reputation. Critics argue that the unchecked power of AI companies like OpenAI poses substantial risks, calling for stronger government oversight and a global cooperative effort to ensure AI development benefits humanity. For more details, visit the Guardian.

Nvidia Stumbles on Next-Gen AI Chip, Delays Production

Nvidia has reportedly delayed its next-generation Blackwell B200 AI chips by at least three months due to a design flaw discovered late in the production process. Mass shipments are now expected in early 2025. This delay was communicated to major cloud customers, including Microsoft. The B200 chips were intended to replace the successful H100 chips that have driven Nvidia's recent financial success.

Nvidia is currently conducting new test runs with Taiwan Semiconductor Manufacturing Co. (TSMC), the manufacturer of its chips. Despite the delay, an Nvidia spokesperson stated that production is still expected to ramp up later this year.

The delay could impact Nvidia's revenue and stock performance, as well as affect TSMC and rival AMD. Major customers like Microsoft, Google, and Meta have placed significant orders for these next-gen chips to support their AI initiatives.

Last week, Nvidia's stock fell 5.1%, AMD's stock dropped 5.35%, and TSMC's stock retreated 7.5%. Microsoft, Meta, and Google also experienced stock fluctuations, reflecting broader market trends and concerns over Nvidia's AI dominance, which is being probed by the Justice Department.

Celebrity Voices Power New Meta AI

Meta has introduced an AI voice assistant featuring celebrity voices, including Judi Dench, Awkwafina, and Keegan-Michael Key. These voices aim to enhance user engagement by offering a more personalized and entertaining interaction experience. The initiative is part of Meta's broader strategy to integrate advanced AI technologies into its platforms, making interactions more dynamic and appealing.

OpenAI's Advanced Voice Mode

OpenAI has launched an Advanced Voice Mode for ChatGPT, designed to make AI interactions more natural and conversational. This new feature, initially available to a select group of ChatGPT Plus subscribers, allows for real-time audio responses that mimic human speech, including natural pauses and emotional intonations. It enhances user interaction by enabling interruptions and adjusting responses dynamically. Safety measures are in place to prevent impersonation and misuse of the technology, with a broader rollout expected later in 2024.

Friend launches AI companion necklace

"Friend" is a new AI companion device developed by Avi Schiffmann and backed by founders from Solana, Perplexity, and ZFellows. This $99 wearable necklace is designed to combat loneliness by providing constant companionship and conversational interaction. It utilizes advanced AI to mimic friendly interactions, offering users emotional support and engaging in casual banter. The device, which avoids functionalities like problem-solving or meeting transcription, is set to launch in January 2025 and aims to enhance social connectivity through technology.

Gemini 1.5 Pro topped the chatbot arena

Gemini 1.5 Pro, developed by Google DeepMind, is being hailed as the new leader among AI chatbots. The model features a long context window, enhancing its ability to understand and respond to complex human instructions across various modalities such as text, audio, and visual inputs. A related Google DeepMind system, known as “Mobility VLA,” builds on these capabilities to enable robots to navigate environments and perform tasks based on natural language commands. This innovation represents a significant step forward in AI and robotics integration, promising more seamless human-machine interactions in the future.

Black Forest Labs dropped FLUX.1

FLUX.1 is an open-source image generation model developed by Black Forest Labs, now available on BasedLabs. It excels in rendering precise text within images, creating complex compositions, and producing anatomically accurate human features, particularly hands. Users can generate high-quality, detailed images by selecting the FLUX.1 model, entering specific prompts, and downloading the resulting images. This tool is designed to create detailed designs like signage, book covers, and branded content, offering significant improvements over previous models.

Figure launches 'most advanced' humanoid robot

Figure, a Bay Area-based robotics firm, has achieved a $2.6 billion valuation after raising $675 million in a Series B funding round. The company, which specializes in humanoid robots, has collaborated with OpenAI to enhance its robots' capabilities. The funding will accelerate their go-to-market strategy, including a deal with BMW to deploy robots. The Figure 01 robot, known for its dexterity and autonomous capabilities, showcases advancements in real-world task performance and complex environment navigation, marking a significant step forward in robotics and AI integration.

Microsoft officially listed OpenAI as a competitor

Microsoft has identified OpenAI as a competitor in both artificial intelligence and search markets. Despite their long-standing partnership and Microsoft's investment in OpenAI, the latter's advancements and new product launches, such as AI-powered search tools, have positioned it as a rival. This shift highlights the growing competition in the AI industry, with major tech companies vying for dominance in AI-driven technologies and services.

Taco Bell revealed its huge AI drive-thru plans

Taco Bell is set to expand its AI-powered drive-thrus to hundreds of locations by the end of 2024. The initiative involves implementing Voice AI technology, which has already been tested in over 100 locations across 13 states in the U.S. This technology aims to improve order accuracy, reduce wait times, and alleviate workloads for employees, allowing them to focus more on customer service. This rollout is part of a broader strategy by Yum! Brands, Taco Bell's parent company, to enhance operational efficiency and customer experience through AI integration. Similar initiatives have been observed in other fast-food chains like Wendy's and White Castle.

Midjourney released V6.1

Midjourney has released version 6.1 of its image model, introducing several enhancements such as improved image coherence, better quality, and more detailed small features. The update includes new 2x upscalers, faster standard image processing, and improved text accuracy. A new personalization model adds nuance and accuracy, with support for personalization code versioning. V6.1 does not include new inpainting/outpainting models, and users can switch back to V6 if needed. V6.1 is the default model for now, and Midjourney plans to release version 6.2 soon.

Stability AI introduced Stable Fast 3D

Stability AI has introduced Stable Fast 3D, a model that generates high-quality 3D assets from a single image in just 0.5 seconds. This tool is built on the TripoSR foundation and features significant architectural improvements. It creates detailed 3D assets, including UV unwrapped meshes and material parameters, making it ideal for game development, virtual reality, retail, and design. The model is accessible via Stability AI's API and Stable Assistant chatbot and is available on Hugging Face under a community license.

GitHub launched 'GitHub Models'

GitHub has introduced GitHub Models, a service that gives developers access to a range of AI models directly on the GitHub platform. Developers can experiment with different models in a built-in playground and then bring them into their own development workflow, making it faster to build AI-powered applications. GitHub Models aims to streamline the process of working with AI by putting model access alongside the code.

Character AI open-sourced 'Prompt Poet'

Character.AI has introduced Prompt Poet, a tool designed to streamline prompt design for AI interactions. This approach shifts from traditional prompt engineering, which involves complex string manipulations, to a more intuitive design process. Prompt Poet allows developers and non-technical users to create and manage prompts efficiently using a combination of YAML and Jinja2 templates. This tool enhances productivity by focusing on crafting precise, engaging prompts while handling complexities like tokenization and truncation seamlessly.

NVIDIA uses Apple Vision Pro for robot training

NVIDIA is advancing humanoid robotics development by offering new microservices, cloud orchestration, and AI workflows. The NIM microservices accelerate deployment times, while the OSMO service simplifies robotics development workflows. These tools enable developers to generate synthetic data and train models efficiently. NVIDIA also introduced an AI-enabled teleoperation workflow that uses Apple Vision Pro to capture human demonstration data. This initiative supports the development of humanoid robots with enhanced capabilities and faster production cycles.

Google revealed and open-sourced Gemma 2 2B

Google has introduced Gemma 2, a series of AI models focusing on safety, efficiency, and transparency. The lineup includes the Gemma 2 2B model for on-device use, ShieldGemma for content safety, and Gemma Scope for model interpretability. These tools enhance AI deployment by ensuring safe interactions and providing deeper insights into AI decision-making. The models are optimized for various hardware and are accessible for research and commercial use.

Vimeo launched AI video translation

Vimeo has launched an AI-powered video translation feature that allows users to translate video audio and captions into 29 different languages. This technology utilizes generative AI to clone the original speaker’s voice, ensuring a natural and authentic translation. The feature aims to enhance accessibility and global reach for video content creators by simplifying the translation process without requiring additional third-party services.

Runway released Gen-3 image-to-video

Runway has introduced Gen-3 Alpha, an AI model capable of transforming still images into high-quality videos. This advanced tool allows users to generate detailed and realistic video clips up to 10 seconds long, using either single images or text prompts as the starting point. The new model promises to enhance creative workflows by simplifying the video creation process, making it more accessible for various applications, from entertainment to professional content production.

Meta introduced SAM 2

Meta has unveiled the Segment Anything Model 2 (SAM 2), an AI model capable of segmenting objects in both images and videos in real-time. SAM 2 improves upon its predecessor by consistently tracking objects across video frames, facilitating tasks like video editing, mixed reality experiences, and visual data annotation for training computer vision systems. This model addresses challenges like fast-moving objects and occlusions, aiming to enhance video content creation and interaction.

Apple revealed its Apple Intelligence technical report

Apple is significantly increasing its investment in AI for 2024 but remains behind competitors like Microsoft and Google. The company's AI initiatives are focused on enhancing existing products and services, such as Siri and Apple Music, rather than pursuing groundbreaking AI research and development. This conservative approach has led to slower progress compared to its Silicon Valley peers, who are making more substantial advancements in the field.

In its technical report, Apple acknowledged using Google's Tensor Processing Unit (TPU) hardware to train its AI models, specifically the Apple Foundation Model (AFM). The training was conducted on Google Cloud TPU clusters, which Apple likely acquired and used within its own data centers. The company plans to invest over $5 billion in AI server enhancements over the next two years.

Leonardo AI was acquired by Canva

Canva has acquired Leonardo AI, a generative AI platform, to bolster its design software capabilities. This strategic move aims to enhance Canva's offerings by integrating Leonardo's advanced AI tools, allowing users to create more sophisticated and unique designs. The acquisition aligns with Canva's vision to democratize design by making powerful creative tools accessible to everyone, from professionals to beginners.

Synthesia introduced "Personal Avatars"

Synthesia has introduced personal avatars, allowing users to create lifelike digital representations of themselves for use in video content. This feature aims to enhance personalized communication and content creation, providing users with an easy way to generate videos featuring their digital double. The personal avatars can be customized and utilized across various professional and personal applications, streamlining the video production process.

Perplexity's publisher revenue sharing

Perplexity AI has introduced a revenue share model to benefit publishers whose content is used by its AI. This initiative aims to address concerns about fair compensation for content creators, ensuring they receive a portion of the revenue generated by the AI's usage of their work. This move is part of Perplexity AI's broader strategy to foster a more collaborative relationship with content providers and maintain a sustainable ecosystem for AI and publishers.

Character AI's cofounders head to Google

Google has hired the cofounders of Character.AI and licensed its language models to enhance its AI capabilities. This move reflects Google's strategy to bolster AI technology by integrating advanced conversational AI models. The co-founders, Noam Shazeer and Daniel De Freitas, will bring their expertise to Google, further driving innovation in AI-powered interactions and services.

Subscribe to my newsletter to learn more and stay updated on the latest AI, Innovation, Cybersecurity, Robots, and Technology developments.

Victor’s Substack
