[AI SPRINT] ChatGPT's New Image Editor, Google Gemini, Napkin.AI, and AI Transformation
Another big week in AI! This week, I’m covering the brand-new ChatGPT 4o image editing, the latest Google Gemini updates, and Napkin.AI’s new visual tools—plus a short essay on the top 3 things to focus on to make your AI transformation a success.
In a surprise announcement Tuesday, OpenAI (finally) released native image editing capabilities within their 4o model. Now, instead of ChatGPT acting as a middleman to DALL·E, 4o generates images directly, giving you tremendous control. In addition, OpenAI announced a slew of feature improvements that make 4o what might be the best general-purpose AI image creator on the market! Just yesterday, I thought that award was going to Gemini, but that’s how fast things change in the AI industry. Here’s a short list of key features:
Character Consistency: Characters within an image will stay the same across multiple variations of an image.
Text Rendering: Finally, ChatGPT can generate high-quality, properly spelled text, from handwriting to code.
Multiturn Generation: You can now modify an image without it generating something completely new.
Object Control: It will recognize and track up to 20 objects, giving you fine-grained control of the content within an image.
Sample Image Learning: It will learn from any images you upload, giving the ability to edit, stylize or inform image creation.
Integrated Knowledge: It will apply real-world concepts to images, ensuring consistency, such as the correct steps to boil an egg.
The only drawback for now is fine-grained editing—you’ll still want tools like Adobe Firefly or Photoshop for that. It’s available today for everyone—except Enterprise and Education plans, which are coming soon. Check it out by just creating images with the 4o model selected!
Google Gemini is Now a Top Competitor
Today, ChatGPT holds about 60% of the AI chat app market, with Microsoft’s Copilot a distant second at about 14%. Third place isn’t Claude or Perplexity, as you might expect, but Google Gemini at 13%.
Many people consider Google to have failed at their AI efforts so far, and their share of the AI chat app market has been steadily declining. However, Google recently announced a variety of impressive updates to their Google Gemini suite of AI products. With Google’s significant share of search traffic and business productivity tools, these enhancements could position Gemini as a real contender. If you haven’t tested out Gemini, it’s time to do so!
In today’s newsletter, I’m breaking down the recent announcements to give you a view of their AI products, and help you start using these in your workflows.
Deep Research Now Uses Gemini Flash 2.0: As I discussed in my review of deep research products a few weeks ago, Gemini’s Deep Research is compelling for its price and quality, yet has trailed in trained knowledge, capability, and reasoning due to its reliance on the outdated Gemini Flash 1.5 model. Those issues are now gone: Google has made Gemini Flash 2.0 the default Deep Research model, bringing significant quality improvements and making it competitive with ChatGPT’s Deep Research. New features include file upload capability, a full 1 million token context window, and enhanced reasoning. Available through Google’s affordable AI Premium subscription, Gemini Deep Research is now my top recommendation for anyone not already using ChatGPT’s paid product.
Google Image Generation: Google has released a new, native image generation model, just beating ChatGPT’s 4o model (discussed earlier) to market. It delivers some of the highest-quality images, comparable to those of Midjourney, and uniquely allows for high-quality editing, ranging from background changes to replacing individual items in the image. Although currently only accessible through the developer-focused AI Studio (choose the Image Generation model), integration into Gemini itself seems imminent.
NotebookLM: This highly popular learning, podcast creation, and data exploration tool has received major upgrades to better serve both individual users and organizations. The paid version offers increased usage limits, new privacy features, and a redesigned interface. The podcasts it creates are now interactive, with customizable response styles, and it now offers shared team notebooks with analytics and new security features purpose-built for organizations.
Gemini Live: Similar to ChatGPT’s Voice Mode, Google Android users can now engage in live conversations with Gemini. Also available through the AI Studio portal, which allows for live screen and webcam sharing, Gemini Live is poised to become an indispensable AI assistant. I used it just yesterday to help me with some difficult Microsoft Excel analysis work I was doing—excellent quality.
Gemini Personalization: Google is now integrating user-specific data, such as Google search history, into the AI experience to enhance relevance and utility. This approach is likely to extend soon to Gmail, Google Docs, and other data, reflecting the trend toward personalized AI experiences. It is a separate model today, but I expect this to become the standard approach for all AIs.
Chained Actions: Gemini can now interface with other Google products to perform tasks like sending emails on your behalf. This capability is a key component of the upcoming Gemini Agent product, designed to execute actions across the Google ecosystem.
At just $20 per month, Google’s AI Premium subscription offers tremendous value and is a strong contender for your secondary AI subscription choice. If your organization is using Google Workspace instead of Microsoft Office, it might even be time to drop ChatGPT and consider switching to Gemini.
The only drawback: Google’s done a terrible job with the user experience, making users jump between multiple applications, which makes adoption harder. As soon as they integrate these features into a single application, I think they’ll outpace at least Microsoft M365 Copilot adoption, which is still struggling (and available for a 15% discount right now to try to generate new business).
Tool Spotlight: Napkin AI
For those needing to create visuals to accompany written work, Napkin AI is a must-try. Specifically designed to transform written text into well-crafted diagrams, Napkin AI supports various chart styles, such as flow charts and pyramid diagrams. Users can customize icons, colors, and text, making it a dynamic alternative to the outdated features of Microsoft PowerPoint.
Here’s an example of a diagram I created using Napkin AI, illustrating the components of a company's Innovation Management system. It took about 1 minute to design this, start to finish, including selecting the right icons and style.
Importantly, Napkin AI is also currently free to use, so go check it out!
Why Most AI Transformations Fail — and How to Succeed
AI adoption isn't merely a tech upgrade—it’s a fundamental business transformation and needs to be approached accordingly. Considering that roughly 70% of digital transformations fail (McKinsey), companies have a valuable opportunity to learn from these failures and improve their odds of success with their AI adoption by focusing on three things: clarifying strategy, aligning leadership, and committing appropriate resources.
First, companies often launch AI initiatives without a clear vision—a "Let's deploy AI" approach without specific, measurable goals. McKinsey emphasizes that successful transformations are grounded in clearly defined objectives. Treating AI as a strategic business transformation rather than just another tech project sets the stage for meaningful, sustainable outcomes. If you are just beginning your journey: work with your leadership team to set clear goals and timeframes before investing.
Second, leadership alignment is essential. Research indicates that transformations thrive when executives communicate a unified vision and consistently reinforce shared goals. Without leadership alignment, even the strongest strategies can devolve into confusion, inefficiency, and stalled progress. For AI transformation, assign a single accountable leader to drive adoption strategy creation and alignment across the organization.
Third, transformation requires committed resources. Many transformation efforts falter due to insufficient investment in talent, technology, or infrastructure. AI initiatives, in particular, demand specialized skills, advanced tools, and substantial support. Without adequate resources, promising AI projects can quickly stall or become isolated experiments that never scale. For AI, ensure your tech team is upskilled, give your staff time to learn these new tools, and find the right partners to assist in your efforts.
Ultimately, companies that succeed with AI recognize that it’s a comprehensive business transformation, requiring clarity, alignment, and commitment at every level. By learning from common pitfalls in digital transformation and proactively addressing these three critical areas, organizations position themselves to capture AI’s true strategic value.
As always, if your leadership team could use help on your AI transformation efforts, reach out!
Action Step: Check out ChatGPT’s new 4o image generator, Google Gemini and Napkin AI’s products, and see if they can help your productivity. If you are responsible for AI efforts in your company, drive leadership alignment on clear goals for AI use, and ensure your organization is sufficiently resourced for success.