News Center

Top AI News in April 2024

2024-04-30 09:49:29

In this month’s roundup, we highlight the top AI news from April:


Adobe Buys Video for Its AI

Adobe is actively buying video footage to develop its AI text-to-video generator. The company is encouraging its network of photographers and artists to submit videos depicting everyday actions and emotions, with an average payout of $2.62 per minute of video. In addition, Adobe is exploring partnerships with third-party AI providers such as Runway, Pika Labs, and OpenAI’s Sora model.


The background: The company’s growing interest in buying videos from photographers and artists reflects a recent trend of companies relying on licensed content to train AI models. By securing the proper licenses, companies can reduce legal risk while obtaining high-quality datasets for model training.


Adobe also plans to bring AI video tools to its Premiere Pro editing platform and to integrate its own generative AI video models into the Firefly series. These tools include the ability to generate and edit video content using text prompts, aiming to enhance the user’s editing experience.


Adobe Firefly training data raises ethical concerns

Adobe’s image generation software Firefly, which has been praised for its ethical training data practices, has sparked controversy after it was revealed that it was trained using images from sources such as Midjourney.


Although Adobe initially claimed that Firefly relied primarily on licensed images from Adobe Stock, it appears that AI-generated content, including images produced by competitors’ tools, also contributed to Firefly’s training. Adobe Stock is one of the few stock photo platforms that accepts content generated by third-party AI services. Because Adobe trains its algorithms on Adobe Stock content, third-party AI-generated images on the platform inadvertently made their way into the training data of tools such as Firefly.

Yet, despite the revelations, Adobe still claims that it controls the quality of its datasets:


“Every image submitted to Adobe Stock, including a small subset of images generated with AI, undergoes a rigorous review process to ensure it does not contain intellectual property, trademarks, recognizable characters or logos, or references to artists.”

Adobe spokesperson


Between the lines: This discovery challenges Firefly’s claims of being a “business-safe” alternative and raises questions about transparency and ethical standards in the development of AI models.


Meta AI’s global rollout

Powered by Meta Llama 3, Meta AI is expanding its global reach with new features designed to make everyday tasks easier and more enjoyable.


Meta AI is now live on Facebook, Instagram, WhatsApp, and Messenger, and available in more than a dozen countries, including Australia, Canada, and Nigeria. Users can now rely on Meta AI to complete a variety of tasks, from recommending restaurants based on specific preferences to explaining complex concepts like genetic traits.


In addition, Meta AI has been integrated into the Meta ecosystem, including search functionality and image generation capabilities, improving the user experience across platforms. With the Imagine feature, users can generate images from text in real time, with clearer image quality and support for adding text to images.

Background: As the AI race heats up, Meta is clearly stepping up its efforts to close the gap with its competitors and position itself as a leader in artificial intelligence.


Snap watermarks AI-generated images

Snap announced plans to mark AI-generated images on its platform with a semi-transparent Snap logo and a sparkle emoji as a watermark. The move is intended to flag images created using Snap AI tools, thereby increasing transparency and safety for users.
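
Snap has not published its implementation, but the core idea behind a semi-transparent watermark is simple alpha blending. The sketch below is a hypothetical, dependency-free illustration in which tuples of (R, G, B) values stand in for a real image library:

```python
def blend(base_px, logo_px, opacity):
    """Alpha-blend one logo pixel over one base pixel."""
    return tuple(round((1 - opacity) * b + opacity * l)
                 for b, l in zip(base_px, logo_px))

def watermark(image, logo, opacity=0.5):
    """image and logo are 2D lists of (R, G, B) tuples; returns a copy
    of image with the logo blended into the bottom-right corner."""
    h, w = len(image), len(image[0])
    lh, lw = len(logo), len(logo[0])
    out = [row[:] for row in image]
    for y in range(lh):
        for x in range(lw):
            oy, ox = h - lh + y, w - lw + x
            out[oy][ox] = blend(image[oy][ox], logo[y][x], opacity)
    return out

# A white 4x4 image with a 2x2 red "logo" at 50% opacity:
img = [[(255, 255, 255)] * 4 for _ in range(4)]
logo = [[(255, 0, 0)] * 2 for _ in range(2)]
print(watermark(img, logo)[3][3])  # (255, 128, 128)
```

Because the logo is blended rather than pasted opaquely, the underlying image stays visible, which is what makes removal both tempting and detectable only with dedicated tooling.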


The company clarified that removing these watermarks would violate its terms of use, but the method for detecting such removals has not yet been made public. In addition, Snap has introduced AI feature indicators and context cards for AI-generated images to provide users with more information.

Between the lines: Snap's decision is consistent with similar initiatives by tech giants such as OpenAI and Meta to mark AI-generated content, and is also in line with the growing trend of transparency and content provenance.


Coca-Cola x Microsoft

The Coca-Cola Company and Microsoft have entered into a five-year strategic partnership to accelerate cloud and GenAI initiatives. Coca-Cola has committed $1.1 billion to Microsoft’s cloud and GenAI capabilities, marking a major step in its ongoing technology transformation. With Microsoft Azure and AI, Coca-Cola aims to revolutionize every business function from marketing to manufacturing and supply chain management. By moving all applications to Microsoft Azure and exploring AI-driven digital assistants, Coca-Cola is committed to improving customer experience, streamlining operations, promoting innovation, and uncovering new growth opportunities.


Background: Coca-Cola is an example of how non-tech brands can use artificial intelligence to gain a competitive advantage. Using AI, Coca-Cola has improved supply chain management, streamlined distribution processes, and enhanced the customer experience. In addition, Coca-Cola recently partnered with OpenAI to launch the “Masterpiece” campaign, which showcases the brand’s innovative marketing approach.


AI in Healthcare Operations

Profluent Bio has used the power of GenAI to develop a breakthrough gene editor called OpenCRISPR-1. The company trained its proprietary protein language model, ProGen2, on a massive database of Cas9-family gene-editing proteins. This approach ultimately produced novel gene-editing proteins capable of modifying human cells. The team also employed a second AI system to generate the guide RNA required for precise targeting. While the design software remains proprietary, Profluent has decided to open OpenCRISPR-1 to researchers, marking a major advance in the field of gene editing.


Moderna, a Cambridge-based pharmaceutical and biotech company, has partnered with OpenAI to integrate ChatGPT Enterprise into all of its operations. Committed to broad adoption, Moderna has launched an ambitious program to ensure that all employees are proficient in GenAI technology. By fostering a culture of collective intelligence and investing in a comprehensive change management program, Moderna has achieved impressive results, including more than 80% of employees successfully adopting mChat, an internal AI chatbot built on the OpenAI API. In addition, Moderna has pioneered the use of AI in clinical trial development and has launched innovative solutions such as Dose ID, which simplifies data analysis and enhances decision-making processes.


Why it matters: These examples show how AI is helping to change the world, and healthcare in particular, for the better.


AI Film Conference

AI on the Lot is preparing for an AI Film Conference on May 16, 2024, at LA Center Studios that will attract more than 500 AI enthusiasts, filmmakers, and professionals. The event will feature film screenings, in-depth panel discussions with industry leaders, hands-on workshops, and live demonstrations exploring the intersection of AI and filmmaking.


The 2024 AI on the Lot conference will feature a number of high-profile speakers, including Katja Reitemeyer, director of data science and AI at NVIDIA; Kathryn Brillhart, virtual production supervisor for films like Fallout and Rebel Moon; and Chad Nelson, creative expert at OpenAI. The conference will focus on how the convergence of technology and creativity will shape the future of entertainment.

Alexander Shironosov, head of the R&D team at Everypixel, dives into the latest AI model releases:


LLMs:

Mistral’s Mixtral-8x22B: A new large model that leverages a mixture-of-experts architecture to improve performance and efficiency.

Meta’s Llama 3 launch: Meta released two versions of the Llama 3 model, with 8B and 70B parameters. The 8B version performs on par with the much larger Llama 2 70B model.

Microsoft’s Phi-3: Following the successful use of Phi-1 and Phi-2 as backbones for small VLMs, Microsoft launched Phi-3. Early metrics from ShareGPT4V-style training on Phi-3 indicate that it outperforms heavier models, suggesting broad potential for adoption in similar applications.

Apple’s OpenELM initiative: Apple has released a series of small open-source AI models, called OpenELM, designed for on-device applications. The models come in four sizes: 270 million, 450 million, 1.1 billion, and 3 billion parameters.

FineWeb release: The FineWeb dataset, a large collection of web text derived from CommonCrawl, has been released under the ODC-By license.

Dolma update: An updated version of Dolma, a 3-trillion-token dataset of web content, academic publications, code, books, and encyclopedic material, has been released.

Snowflake’s Arctic base model: Snowflake released Snowflake Arctic along with a detailed exploration of the model, which uses a mixture-of-experts architecture to handle a variety of AI tasks efficiently.

Innovation from startup Answer.AI: Answer.AI published an article and released code for its FSDP/QDoRA approach, which allows fine-tuning large Llama 3 models on just two GPUs with 24GB of memory each, demonstrating an efficient way to manage resource-intensive AI training.
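
Some back-of-envelope arithmetic shows why quantization plus parameter sharding makes this possible. The numbers below are illustrative only (weights only, ignoring activations, adapter state, and quantization overhead), not Answer.AI’s exact accounting:

```python
def shard_memory_gb(n_params: float, bits_per_param: float, n_gpus: int) -> float:
    """GB of weight memory per GPU when parameters are sharded evenly."""
    return n_params * bits_per_param / 8 / n_gpus / 1e9

# A 70B-parameter model's weights in 16-bit precision need ~70 GB per
# GPU even when split across two cards, far beyond a 24 GB card.
print(shard_memory_gb(70e9, 16, 2))  # 70.0

# Quantized to 4 bits, the sharded weights drop to ~17.5 GB per GPU,
# leaving headroom for activations and low-rank adapter training.
print(shard_memory_gb(70e9, 4, 2))   # 17.5
```

This is the core trade the approach exploits: the frozen base weights are stored in 4-bit form, while only small adapter matrices are trained in higher precision.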

VLMs (Vision-Language Models):

InternVL 1.5: This open-source model has a powerful visual encoder and is trained on high-quality datasets of images at various resolutions, from 448×448 up to 4K×4K. On some benchmarks, InternVL 1.5 outperforms top commercial models such as GPT-4V, Claude 3 Opus, and Gemini 1.5 Pro.

New Benchmark for Testing Visual Language Models (VLMs): A new version of a benchmark designed to test vision-language models on text-heavy images has been released. It aims to provide a more rigorous evaluation of VLM performance on complex visual-textual interactions, which is critical for improving their real-world applications.

Video Generation:

Microsoft's Talking Head Model: Microsoft has introduced a new model that generates "talking face" videos from audio input and photos. The model uses a diffusion model and significantly outperforms previous methods on all major performance metrics. This release has the potential to revolutionize the way dynamic video content is created from static images and sounds.

Image Generation:

Imgsys Text-to-Image Arena: A new platform called imgsys has been launched to facilitate pairwise comparisons and build Elo ratings for various text-to-image models. The lineup includes checkpoints of models such as SDXL as well as standalone models comparable to PixArt-Sigma.
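
Elo ratings from pairwise votes work the same way as in chess. A minimal sketch of the update rule follows; the actual K-factor and aggregation details imgsys uses may differ:

```python
def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    """Update two ratings after one pairwise comparison."""
    # Expected score of the winner under the logistic Elo model.
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1.0 - expected)  # points transferred from loser to winner
    return r_winner + delta, r_loser - delta

# Two models start equal; one pairwise vote moves 16 points across.
print(elo_update(1000.0, 1000.0))  # (1016.0, 984.0)

# An upset win over a higher-rated model transfers more points.
print(elo_update(1000.0, 1200.0))
```

Because each vote only shifts points between the two compared models, the total rating mass is conserved and rankings converge as votes accumulate.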

NVIDIA’s Diffusion Model Enhancements: NVIDIA published two papers detailing methods for improving image generation quality with diffusion models without retraining them. The first method adjusts the scheduling of classifier-free guidance to enhance image sharpness, while the second optimizes the denoising schedule to further improve the output.
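
The papers’ exact formulations aren’t reproduced here, but both build on the standard classifier-free guidance (CFG) step, sketched below: the denoiser is run with and without the text condition, and the two noise predictions are mixed with a guidance weight w. “Scheduling” approaches vary w (or the step sizes) over the denoising trajectory rather than holding them fixed:

```python
def guided_prediction(eps_uncond, eps_cond, w):
    """Classifier-free guidance: mix unconditional and conditional
    noise predictions using guidance weight w."""
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# w = 1 reproduces the plain conditional prediction; larger w pushes
# the sample harder toward the prompt (sharper, but less diverse).
print(guided_prediction([0.0, 0.0], [1.0, -1.0], 1.0))  # [1.0, -1.0]
print(guided_prediction([0.0, 0.0], [1.0, -1.0], 7.5))  # [7.5, -7.5]
```

Varying w per step lets a sampler use strong guidance only where it helps, which is why such methods can improve quality without touching the trained model.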

Improved Portrait Generation IP-Adapter: The Everypixel team developed an enhanced IP-Adapter for generating accurate, detailed portraits from photographs. The tool uses advanced image-processing techniques to improve the realism and quality of generated portraits.

Meta’s Diffusion Model Acceleration: Meta published a paper detailing its new method, “Imagine Flash,” which accelerates diffusion models through a technique called “backward distillation.” The method significantly speeds up the processing time of diffusion models while maintaining or even improving the quality of generated images.

Adobe Firefly v3 for Photoshop: Adobe has introduced Firefly v3, a new version of its Photoshop integration. The tool allows users to remove or replace specific objects, change backgrounds, and generate new images from scratch.


