Tag Archives: AI

Not getting left behind: AI business transformation

By Eric Picard

Here’s something that people are missing about the changes coming from AI. You can’t just treat AI as a productivity tool. You can’t just use it to speed up document creation, you have to turn it into a thinking tool, a tool to make your ideation and decision-making better.

If the CEO and the rest of the leadership team are not treating AI as a core capability across all aspects of the organization, their company is going to be standing still while competitors run by. So what do I mean by “core capability across all aspects of the organization”?

AI used correctly creates a workbench of hyper-intelligent advisors who can augment the intelligence of the leadership team – especially the CEO – so that they can make better decisions in less time. It needs to come from the top down, and it needs to be mandated. It should not be delegated to the IT team, whose job is to get the best price with the highest uptime and the lowest number of support tickets (great for many things, but bad when trying to do business transformation). Companies need to keep AI adoption at very senior levels, abstracted from the IT team, and must be willing to live with the friction this causes.

In my practice I’m seeing that when the leadership team starts using AI tools in their daily workflow, ingraining them into their habits, the whole org starts moving faster. It’s when AI is used not only for document creation (boy we’ve really sped up writing – what a productivity boost!), but also for ideation (our ideas are now battle tested and pushed around and stronger before we start executing!) that things really change. To be clear, this is not abdicating their ideation to the AI, but bouncing ideas off the AI and getting feedback from a variety of points of view.

If you don’t understand what I’m talking about, try this simple experiment:

Create several specific personas. Here are some basic prompts that will help you try this out:

World Class CEO: You are a world class CEO with 25 years of leadership experience in the _______ industry. You are now retired, but mentoring other CEOs and leadership teams. You are cranky – you don’t glad-hand your mentees, you don’t tell them what they want to hear. You are inherently kind, you don’t argue for the sake of arguing, but you don’t hold back when you see mistakes being made. Your point of view is that _______________ and ____________ are great leaders who you try to emulate. And ____________ and _________ are overrated.

World Class CFO: You are a world class CFO with 25 years of experience at a combination of large publicly traded companies and startups that went through rapid growth and went public under your leadership. You understand the needs of a growing business, as well as the concerns of a large publicly traded company with regulatory oversight in the following industries _______________, _______________, ________________.

Innovative Entrepreneur: You are a massively successful entrepreneur in the ________________ and ____________ industries. You’ve broken each of these industries previously by disrupting the status quo and driving incredible success doing things that nobody expected to be successful. You are seasoned enough that you understand the issue of inertia, and you are not afraid of experimenting and failing. You’ve been through 3 acquisitions, where you had to transition to roles within large companies and you understand how big companies differ from startups, and were surprisingly successful navigating these organizations. You’re somewhat brash, and you tell the truth regardless of who you’re talking to.

Finish these prompts off by putting in the industry and names (the names have to be well known figures) and copy/paste these into your AI tool of choice before starting to run ideas by them. You’ll be shocked at the outcome, and the difference of opinion you get from each of these.
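If you would rather fill in the blanks programmatically than by hand, a minimal Python sketch could look like the one below. The placeholder names (`industry`, `admired_1`, and so on) and the `build_persona` helper are illustrative choices of mine, and the persona text is condensed from the CEO prompt above. Nothing here is tied to any particular AI tool.

```python
from string import Template

# Condensed from the "World Class CEO" prompt above.
# The $placeholders are hypothetical names standing in for the blanks.
CEO_PERSONA = Template(
    "You are a world class CEO with 25 years of leadership experience in the "
    "$industry industry. You are now retired, but mentoring other CEOs and "
    "leadership teams. Your point of view is that $admired_1 and $admired_2 "
    "are great leaders who you try to emulate. And $overrated_1 and "
    "$overrated_2 are overrated."
)


def build_persona(template: Template, **blanks: str) -> str:
    """Fill every blank; Template.substitute raises KeyError if one is missing."""
    return template.substitute(**blanks)


ceo_prompt = build_persona(
    CEO_PERSONA,
    industry="consumer packaged goods",  # example value only
    admired_1="<well-known leader A>",   # replace with real, well-known names
    admired_2="<well-known leader B>",
    overrated_1="<well-known leader C>",
    overrated_2="<well-known leader D>",
)
print(ceo_prompt)
```

Using `substitute` rather than `safe_substitute` is deliberate here: a forgotten blank fails loudly instead of quietly ending up in the prompt you paste.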

Now imagine if you had a stable of 20 or 30 or 100 of these predefined prompts that you could pull out whenever you needed them. Imagine if you debated all your big decisions with a large group of experts with strong points of view. Do you think your ideas would come out the other side stronger or weaker?
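To make the "stable of prompts" idea concrete, here is a rough Python sketch of putting one idea in front of several saved personas. The `AskFn` signature, the `run_panel` helper, and the `placeholder_ask` function are all assumptions for illustration. In practice you would replace `placeholder_ask` with a call to whatever AI tool or API you actually use.

```python
from typing import Callable, Dict

# Hypothetical signature: (persona prompt, your message) -> reply text.
# Swap in a real call to whatever AI tool or API you use.
AskFn = Callable[[str, str], str]

# A small "stable" of personas; the text is abbreviated from the prompts above.
PERSONAS: Dict[str, str] = {
    "World Class CEO": "You are a world class CEO with 25 years of leadership experience.",
    "World Class CFO": "You are a world class CFO with 25 years of experience.",
    "Innovative Entrepreneur": "You are a massively successful entrepreneur.",
}


def run_panel(idea: str, personas: Dict[str, str], ask: AskFn) -> Dict[str, str]:
    """Put the same idea in front of every persona and collect each reply."""
    return {name: ask(prompt, idea) for name, prompt in personas.items()}


def placeholder_ask(persona_prompt: str, message: str) -> str:
    """Stand-in so the sketch runs offline; it does not call any model."""
    return f"(reply from persona '{persona_prompt[:28]}...' about: {message})"


replies = run_panel("Should we expand into Europe next year?", PERSONAS, placeholder_ask)
for name, reply in replies.items():
    print(f"{name}: {reply}")
```

Keeping the personas in a plain dictionary means adding the twentieth or hundredth one is just one more entry.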

Having done this a lot – I can tell you my experience. I often disagreed with the input I was getting, which was half the value. It helped me become clearer on my own point of view. I got feedback I didn’t like, but sometimes it was what I needed to hear. When I pulled in advice from “experts” with different skillsets than my own, I got really valuable expansions of my thinking. For instance – when I was CPO at Bark, I knew very little about supply chain when I started, and became pretty knowledgeable over the time I was there. Part of the way I did that was to create a supply-chain expert prompt who I could run my ideas by. Note – I also became pretty close with our leadership on supply chain and had weekly meetings with various folks to get smarter faster. But I could ask really dumb questions of my AI expert, as often as I wanted, and then take my refined understanding to those other meetings.

It’s like having a superpower. And anyone not doing this in 2 years is going to be left behind.

I happen to use an AI workbench called CharmIQ that makes this much easier. You create “charms” in this tool, which are saved prompts like the ones I describe above. You can assign these prompts to any LLM – CharmIQ comes with baked in access to all of them for one monthly subscription fee. It makes things much easier.

If you want to try it out, feel free to use this link. It gives you a discount and puts some change in my pocket. I will tell you I use this tool for literally hours every day. It’s a game changer.

Click Here to Try CharmIQ.


How AI is going to change software development

by Eric Picard

As someone who’s spent decades watching technology waves crash over the software industry, I find myself constantly recalibrating my predictions about AI’s impact on how we build software. Two years ago, I wrote about the potential for AI to create massive productivity gains and fundamentally alter development practices. Now, with some genuinely surprising developments happening right under our noses, I can see we’re heading somewhere much more interesting than I initially thought.

The productivity gains from AI tools like GitHub Copilot, ChatGPT, Cursor, and Replit are real, but what’s caught my attention is how we’re seeing two distinct paths emerge. Professional developers are using AI as incredibly powerful assistants, while non-developers are essentially managing teams of AI-powered junior developers through conversational interfaces. Both approaches are working, just for different types of projects and different scales of ambition.

Two Worlds of AI-Powered Development

Professional developers have largely embraced AI as sophisticated tooling that makes them dramatically more effective. They’re using AI for code generation, debugging assistance, refactoring, and rapid prototyping, but they’re doing it within established architectural patterns and development methodologies. The productivity gains are genuine – I’m consistently hearing reports of 30-50% improvements in specific tasks – but these developers maintain architectural oversight and code comprehension. I’ve talked to several developers who are often referred to as “10x” developers, meaning they’re ten times more effective than others on their teams. These particular developers were 10x devs at big tech companies, so they are certainly more than ten times as effective as most developers. Each of them has told me that they are 100x-ing themselves using AI. So this is not a small change.

Then there’s what Andrej Karpathy coined “vibe coding” – developers describing what they want in plain English and accepting AI-generated code without necessarily understanding every line. This “fully giving into the vibes” approach emphasizes rapid experimentation over careful architectural planning. Y Combinator reported that 25% of their Winter 2025 batch had codebases that were 95% AI-generated, which represents a fundamentally different relationship with code creation.

The key insight is that these aren’t competing approaches – they’re serving different needs. Professional development teams use AI to accelerate work within proven frameworks, while vibe coding enables non-developers or small teams to build functional applications that would have been impossible for them to create just a few years ago.

Replit: DevOps Intelligence for the AI Era

Replit’s explosive growth – from $10M to $100M ARR in less than six months – illustrates something important about where this is heading. Replit is actually the intelligent evolution of DevOps principles, bringing continuous integration, automated testing, and deployment automation to AI-driven development.

Traditional DevOps emerged because managing infrastructure, deployment pipelines, and scaling manually was becoming impossible at scale. Smart development teams adopted CI/CD, automated testing, infrastructure as code, and monitoring because these practices made complex systems manageable and reliable.

Replit takes these same principles and makes them accessible through conversational interfaces. When their AI Agent selects a tech stack, generates code, sets up databases, and handles deployment, it’s not eliminating DevOps – it’s automating DevOps intelligence so that non-developers can benefit from these practices without needing to understand them deeply.

This matters because it’s expanding who can build functional software. You don’t need to understand Kubernetes, Docker, CI/CD pipelines, or infrastructure configuration to get the benefits of modern deployment practices. The AI handles the complexity while applying proven DevOps principles under the hood.

The Custom Application Renaissance

What excites me most about this trend is that we’re finally approaching the custom application future that the internet promised but never quite delivered. For twenty years, we’ve talked about having rich, customized web applications for internal business processes, but the development overhead made it impractical for most organizations.

Now we’re entering an era where custom web applications for fairly complex business tasks can be built quickly and cost-effectively. The “intranet” that organizations have wished for – dynamic, task-specific applications that actually solve their particular workflow problems – is becoming achievable. Just this week I built two very powerful internal apps for one of my clients inside of CharmIQ. These apps automate extremely intensive processes that were bottlenecks for my client. And three weeks ago, I had no idea how to do this. I’d have hired someone to build them.

I’m hearing about custom applications for everything from inventory management to customer onboarding to internal reporting, applications that would have required months of development and significant ongoing maintenance. These aren’t replacing enterprise software entirely, but they’re filling the gaps where off-the-shelf solutions don’t quite fit.

The Architecture Challenge Ahead

As these AI-powered development approaches mature, we’re approaching a fundamental architectural question. Current approaches work well for their respective use cases, but we need architectural patterns optimized for AI collaboration rather than just AI assistance. Right now AI developers are about as talented as junior developers – with maybe 2-3 years of experience. They break things a lot, the code isn’t efficient, they’re not always architecting things properly. But that’s a short-term problem – we’re only a few years away from AI developing software as well as any human, or better. What happens then?

The traditional monolithic applications or coarse-grained microservices that work well for human development teams may not be optimal for AI-powered development environments. Some teams experiment with treating code as completely disposable, letting AI regenerate implementations for each iteration. This works for prototypes and simple applications, but it breaks down for complex systems where you lose accumulated knowledge and performance optimizations.

Components: The Architecture for AI Collaboration

I think the future lies in component-based architectures that provide the right granularity for AI systems to work effectively. This draws inspiration from earlier component models like Microsoft’s COM objects, adapted for modern cloud environments and AI capabilities.

Applications would be built from well-defined components with stable interfaces and clear functional boundaries. Each component handles specific capabilities – user authentication, payment processing, data transformation, content generation – with explicit contracts for inputs and outputs. The critical insight is that while these interfaces remain stable, AI systems can continuously optimize, refactor, or completely reimplement the internal logic of individual components.

This architecture offers several advantages for AI-powered development. Components provide bounded problem spaces where AI systems can operate effectively without breaking broader system functionality. The stable interfaces enable comprehensive testing and debugging, while internal flexibility allows for continuous optimization based on performance data and changing requirements.
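A small Python sketch may help show what "stable interface, replaceable internals" means in practice. Everything below is hypothetical: the `Deduplicator` contract, the two implementations, and the `satisfies_contract` check are invented for illustration and stand in for any data-transformation component.

```python
from typing import List, Protocol


class Deduplicator(Protocol):
    """Stable contract: drop duplicates, keep first occurrences in order."""

    def dedupe(self, items: List[str]) -> List[str]: ...


class SimpleDeduplicator:
    """First implementation: easy to read, slow on long lists."""

    def dedupe(self, items: List[str]) -> List[str]:
        result: List[str] = []
        for item in items:
            if item not in result:
                result.append(item)
        return result


class FastDeduplicator:
    """Reimplemented internals; the interface and behavior are unchanged."""

    def dedupe(self, items: List[str]) -> List[str]:
        return list(dict.fromkeys(items))  # dicts preserve insertion order


def satisfies_contract(component: Deduplicator) -> bool:
    """Interface-level test that any replacement must pass before it ships."""
    cases = [
        ([], []),
        (["a", "b", "a"], ["a", "b"]),
        (["x", "x", "x"], ["x"]),
    ]
    return all(component.dedupe(list(given)) == expected for given, expected in cases)


for candidate in (SimpleDeduplicator(), FastDeduplicator()):
    print(type(candidate).__name__, satisfies_contract(candidate))
```

The contract test is the piece that stays fixed. Whoever rewrites the internals, human or AI, only has to keep that test passing.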

What This Looks Like in Practice

Over the next five to ten years, I expect we’ll see component registries emerge that catalog available functionality with detailed specifications. AI systems will continuously monitor component performance and generate optimized versions for testing and gradual deployment.

Applications will become more dynamic, automatically reconfiguring by swapping component implementations based on load patterns, user behavior, or resource availability. Unlike current microservices managed by human teams, these components would be maintained by AI systems operating within architectural guidelines defined by human engineers. And eventually, we may even let the AI take that over too.
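As a rough illustration of a registry that swaps implementations based on load, consider the Python sketch below. The `ComponentRegistry` class, the `pick_version` policy, the version names, and the 1000 requests-per-second threshold are all made up for this example. A real system would add health checks, gradual rollout, and rollback.

```python
from typing import Callable, Dict, List

Implementation = Callable[[List[int]], int]


class ComponentRegistry:
    """Maps a component name to interchangeable implementations."""

    def __init__(self) -> None:
        self._implementations: Dict[str, Dict[str, Implementation]] = {}
        self._active: Dict[str, str] = {}

    def register(self, component: str, version: str, impl: Implementation) -> None:
        self._implementations.setdefault(component, {})[version] = impl
        self._active.setdefault(component, version)  # first registered is live

    def activate(self, component: str, version: str) -> None:
        if version not in self._implementations.get(component, {}):
            raise KeyError(f"{component} has no version {version!r}")
        self._active[component] = version

    def active_version(self, component: str) -> str:
        return self._active[component]

    def call(self, component: str, payload: List[int]) -> int:
        return self._implementations[component][self._active[component]](payload)


def pick_version(requests_per_second: int, threshold: int = 1000) -> str:
    """Toy policy: switch to the high-throughput build under heavy load."""
    return "high_throughput" if requests_per_second >= threshold else "baseline"


def total_baseline(values: List[int]) -> int:
    total = 0
    for value in values:
        total += value
    return total


registry = ComponentRegistry()
registry.register("order_total", "baseline", total_baseline)
registry.register("order_total", "high_throughput", sum)

for load in (50, 5000):
    registry.activate("order_total", pick_version(load))
    result = registry.call("order_total", [5, 10, 15])
    print(load, registry.active_version("order_total"), result)
```

Callers only ever use `registry.call`, so the swap is invisible to them as long as both implementations honor the same contract.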

The development process shifts toward interface-first design, where human architects focus on defining component boundaries and interactions, while AI systems handle implementation details. This division of labor plays to respective strengths: humans excel at architectural thinking and business requirements, while AI systems optimize implementations and handle routine coding tasks. And as the AI gets better and better at architecture and business requirements development, we may see a whole new world emerge.

The Transition Path Forward

This transformation is happening gradually but accelerating quickly. Professional developers are becoming more effective through AI assistance while maintaining architectural oversight. Non-developers are building functional applications through conversational interfaces that would have been impossible for them to create previously.

Current service-oriented architectures provide a foundation that can evolve toward component models as AI capabilities mature. Organizations with good interface design practices, comprehensive testing strategies, and strong observability will be best positioned for this transition.

The engineers who thrive will be those who can think architecturally about system design while effectively directing AI systems. Product managers become even more critical because rapid prototyping capabilities make clear product vision, competitive intelligence, customer-centric approaches and market understanding the primary competitive differentiators.

The Strategic Reality

As we move toward this future, competition shifts in important ways. Technical barriers to building certain types of software continue falling, but success increasingly depends on architectural excellence and product strategy rather than implementation speed alone.

Organizations that can design effective component architectures and orchestrate AI development systems will gain significant advantages in both development velocity and system reliability. The ability to continuously optimize software systems without traditional refactoring risks could become a major competitive edge.

However, this also presents new challenges around managing dynamic system complexity, ensuring security across AI-generated code, and maintaining coherent user experiences across rapidly evolving implementations.

The transformation isn’t about replacing human engineers – it’s about creating new collaboration models between human architectural thinking and AI implementation capabilities. The future belongs to organizations that can effectively combine these strengths while maintaining clear product vision and strategic focus.

We’re witnessing a shift from static implementations toward dynamic, continuously optimizing systems. While full realization is still years away, the foundation is being built through current experiments with AI-assisted development, vibe coding platforms, and component-based architectures. Replit’s growth numbers suggest this isn’t theoretical anymore – it’s happening faster than most of us expected, and the organizations preparing now will be best positioned to capitalize on the opportunities it creates.


AI Agents via CharmIQ have Supercharged my work

by Eric Picard

Today I’m exploring a tool that stands out in its ability to harness AI potential in ways I haven’t seen before—CharmIQ. This article discusses the capabilities of CharmIQ and is itself crafted using the platform. I’ve spent over twenty-five years building products and leading teams. I’ve witnessed technology evolve from multiple perspectives, helping guide both startups and large corporations through innovation challenges. This experience gives me a nuanced view on tools that genuinely transform how we work.

The artificial intelligence landscape, especially regarding large language models (LLMs), continues to expand rapidly. Each model offers distinct strengths and weaknesses. Navigating these options can overwhelm even experienced practitioners. CharmIQ addresses this challenge with a document-based approach rather than the chat interfaces that dominate the market. This represents more than a superficial interface change. It fundamentally changes how we interact with AI, allowing these capabilities to integrate more naturally into existing workflows.

CharmIQ’s power comes from its ability to create specialized AI agents called Charms. These agents function as virtual team members, each bringing unique perspectives to problem-solving. Every Charm possesses specialized knowledge and capabilities, enabling me to handle tasks that traditionally required teams of experts. These Charms act as cognitive partners, working alongside you whether generating strategic solutions, refining methodologies, or creating content.

Creating a Charm takes minimal effort. You simply describe the type of expert you need, and the system uses another Charm to write the definition for your new agent. Creating an effective Charm typically takes 2-3 minutes. The output from different Charms varies significantly. Asking two different Charms the same question produces distinctly different answers.

CharmIQ has integrated with virtually every commercially available and open-source LLM. These integrations enable power users to leverage specific strengths of different models. Anthropic’s models excel at writing content. ChatGPT leads in creativity and reasoning. CharmIQ lets users switch between these models seamlessly, ensuring the right tool for each task.

This capability allows small teams to operate with the capacity and expertise of much larger organizations. By enabling AI-enhanced collaboration among human team members and AI-based agents, CharmIQ democratizes access to advanced capabilities. This matters significantly in today’s competitive landscape where agility drives success.

Consider a practical scenario: a product manager developing a go-to-market strategy. Traditionally, this involves coordinating across departments, synthesizing input from market research, sales, and engineering, and iterating through numerous drafts. With CharmIQ, the product manager can use specialized Charms for each step. One Charm analyzes market trends and customer insights, while another drafts a comprehensive go-to-market plan. This approach saves time and enhances quality by incorporating diverse perspectives.

The document-centric approach means all interactions, feedback, and iterations exist within a cohesive workspace. This eliminates friction associated with switching between tools and interfaces. The result? A streamlined workflow that lets teams focus on innovation and value delivery.

CharmIQ’s collaborative features extend beyond AI-human interaction to include real-time collaboration among team members. This proves invaluable in today’s remote and hybrid work environments. Team members work together within the dynamic workspace, sharing insights, providing feedback, and iterating on ideas without the constraints of traditional interfaces.

CharmIQ stands out not just for technical capabilities but for its strategic vision. By allowing users to customize and deploy multiple AI personas, it fosters creativity and experimentation. Users aren’t limited to predefined workflows. They can explore new approaches, test hypotheses, and iterate on solutions in real-time.

This flexibility benefits organizations trying to stay ahead in rapidly evolving markets. CharmIQ provides tools to adapt quickly to changing conditions, identify emerging opportunities, and respond with agility. For entrepreneurs and startups, this capability can determine success or failure.

As someone immersed in product management, I recognize the importance of aligning technology with business objectives. CharmIQ accomplishes this by providing a platform that enhances productivity, fosters collaboration, and drives innovation. It lets users focus on strategic thinking and high-quality work while reducing time spent on repetitive tasks.

By shifting from chat-based to document-centric interactions, CharmIQ redefines how we work with AI and each other. Its ability to integrate multiple AI models and create specialized agents offers remarkable flexibility and power. For teams of all sizes, CharmIQ enables AI-driven collaboration that unlocks new productivity and creativity levels.

I use this tool for hours daily. It saves me at least 1-2 days of work every week when generating work product. Incorporating CharmIQ into my workflow has boosted my productivity by 10-30x. If I primarily created documents or wrote code, I believe it would approach a 100x multiplier.

I’ve advised their team and CEO since they released their first internal beta. As they expanded access, they quickly discovered its broad appeal. Everyone who uses it becomes an avid power user within days.

If you want to try it, they offer a free trial. However, to fully understand its capabilities, I recommend signing up for a paid plan that unlocks all features. Use it for a few consecutive days, and you’ll likely find it transforming how you work.

Some personal and professional use cases:

I’ve introduced CharmIQ to all my teams and watched those who adopt it transform their workflow and approach. This transformation spans product managers, marketing teams, writers, and software developers. The software architect Charms I’ve created have educated teams on best practices and streamlined product launches.

Professional Example:

While leading Technology at Bark, we launched a new mobile app in just months using React Native. We accomplished this with two full-time developers (neither had used React Native before), plus a fractional team of one product manager and one QA specialist. From kick-off to launching both iOS and Android apps took three months. We used CharmIQ to create Charms that amplified each team member’s work: Market Research, Competitive Analysis, Product Strategy, Go-to-Market Plans, Architecture Design, Software Development Environment Configuration, and code writing and testing.

Personal Example – Writing:

As an author, I had spent three years writing a novel but got stuck with about 100 pages remaining. I couldn’t organize the final chapters and remained completely blocked for almost a year. I pasted my novel into CharmIQ and created Charms based on my favorite authors. I included my outline and what I had written so far, then asked these author-Charms (Neal Stephenson, Neil Gaiman, C.J. Cherryh, Frank Miller, Stephen King, Cormac McCarthy, Joan Didion, and Ian McEwan) for detailed feedback and help refining the remaining outline.

Their initial feedback proved harsh but broke my creative block. I finalized the outline in one night and wrote the remaining 100 pages as a first draft in about two weeks. After gathering human feedback, I made major revisions that continue today. My writing process now deeply incorporates CharmIQ.

Personal Example – Health:

I’m in my 50s with several long-term managed health issues, and I’ve created a repository for all my medical data including test results, visit summaries, scans, and reports. I’ve created Charms representing my doctors: Primary Care Physician, Neurologist, Cardiologist, and Vascular Surgeon. The feedback from these Charms mirrors what my actual doctors tell me. I can also have them collaborate with each other, something nearly impossible in real medical practice.

I’ve created additional Charms for recipes, mixology, veterinary advice, restaurant recommendations, vacation planning, plumbing, HVAC, electrical work, home automation, and more.

Give CharmIQ a try using this affiliate link and I’ll get a small financial bonus (anyone can sign up to be an affiliate – I’m just beta testing this for them). They’re a great team of wonderful humans, and they’ve built a product that has changed everything for me. I’d cry if it went away.


How AI is actively transforming advertising

by Eric Picard

Artificial Intelligence (AI) and its subset, Machine Learning (ML), have been integral to the advertising industry for decades. These technologies have transformed how businesses connect with consumers, optimizing many of the dollars spent and ensuring targeted engagement. But as AI evolves, particularly with the advent of Large Language Models (LLMs) and generative AI, we’re entering a new era where smart automation is a reality. Let’s explore how AI, ML, and LLMs are reshaping advertising, untangling the complexities of these technologies, and understanding their distinct roles and applications.

To begin with, it’s important to differentiate AI and ML. AI is the broader concept of machines performing tasks that typically require human intelligence, such as decision-making and pattern recognition.

Machine Learning (ML) is a subset of AI, and has been the engine that powers many of the smart decisions in today’s advertising landscape. At its core, ML is about teaching computers to learn from data, much like how we learn from experience. There are several approaches to ML, each with its own applications and strengths in advertising.

One of the most prevalent techniques is supervised learning. This approach involves training an ML model on a dataset that includes both inputs and the correct outputs—sort of like coaching a sports team with the playbook in hand. In advertising, supervised learning is often used for predictive targeting. By analyzing historical data, these models can forecast which segments of an audience are most likely to respond to a specific ad campaign. This allows advertisers to allocate resources more effectively and maximize return on investment.
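For readers who want to see the shape of this, here is a toy supervised-learning sketch in plain Python. It is a one-feature logistic regression trained by gradient descent. The data is invented (prior site visits as the input, whether the person responded to an ad as the label), and production targeting models use far more features and a proper ML library.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(
    xs: List[float], ys: List[int], lr: float = 0.1, epochs: int = 5000
) -> Tuple[float, float]:
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(w: float, b: float, x: float) -> float:
    """Estimated probability of a response for someone with x prior visits."""
    return sigmoid(w * x + b)


# Made-up history: inputs paired with the known correct outputs.
prior_visits = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
responded = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(prior_visits, responded)
for visits in (0.0, 7.0):
    print(f"{visits:.0f} prior visits -> response probability {predict(w, b, visits):.2f}")
```

The "playbook in hand" part is the `responded` list: the model is only able to learn because every training example comes with the right answer attached.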

Unsupervised learning, on the other hand, is a bit like sending a detective into a room full of clues without any instructions. The model explores the data, finding patterns and relationships on its own. This technique is ideal for audience segmentation, helping advertisers discover new and potentially valuable consumer groups based on shared behaviors or characteristics. It’s akin to discovering hidden subcultures within a larger community, providing insights that can drive more personalized marketing strategies.
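A bare-bones k-means sketch in plain Python shows the idea of finding segments without labels. The customer data (visits per month, average order value) and the starting centroids are invented for illustration. Real segmentation would scale the features first and use a library implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def mean_point(points: Sequence[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def kmeans(
    points: Sequence[Point], centroids: Sequence[Point], iterations: int = 10
) -> Tuple[List[Point], List[int]]:
    """Alternate between assigning points to centroids and recentering them."""
    current = list(centroids)
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in current]
        for p in points:
            groups[nearest(p, current)].append(p)
        # Keep the old centroid if a group ends up empty.
        current = [mean_point(g) if g else c for g, c in zip(groups, current)]
    labels = [nearest(p, current) for p in points]
    return current, labels


# (visits per month, average order value); no labels are provided.
customers: List[Point] = [
    (1.0, 20.0), (2.0, 25.0), (1.0, 30.0),
    (9.0, 110.0), (10.0, 120.0), (8.0, 100.0),
]
centroids, labels = kmeans(customers, centroids=[customers[0], customers[-1]])
print("segment per customer:", labels)
print("segment centers:", centroids)
```

Nobody told the algorithm that there is a "occasional small-basket" group and a "frequent big-basket" group. It only groups points that sit close together, and naming the segments is left to the marketer.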

Reinforcement learning is another fascinating ML approach, where models learn by trial and error—similar to training a pet with rewards and consequences. In the dynamic world of real-time bidding, reinforcement learning algorithms adjust bidding strategies on the fly, learning which actions yield the best results. This adaptability is crucial in environments where market conditions and consumer behavior can change rapidly.
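The trial-and-error loop can be sketched with an epsilon-greedy multi-armed bandit, which is one of the simplest reinforcement-learning setups and a deliberate simplification of what real bidding systems do. The bid levels, the click value, and the simulated auction below are all invented. The simulation only exists so the sketch runs without a live exchange.

```python
import random
from typing import Callable, List, Tuple


def epsilon_greedy_bidder(
    bids: List[float],
    reward_fn: Callable[[float], float],
    rounds: int,
    epsilon: float = 0.1,
    seed: int = 0,
) -> Tuple[float, List[float], List[int]]:
    """Learn which bid level earns the most by trying bids and tracking rewards."""
    rng = random.Random(seed)
    counts = [0] * len(bids)
    estimates = [0.0] * len(bids)
    for t in range(rounds):
        if t < len(bids):
            arm = t  # try every bid level once to start
        elif rng.random() < epsilon:
            arm = rng.randrange(len(bids))  # explore a random bid
        else:
            arm = max(range(len(bids)), key=lambda i: estimates[i])  # exploit
        reward = reward_fn(bids[arm])
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
    best = max(range(len(bids)), key=lambda i: estimates[i])
    return bids[best], estimates, counts


def make_simulated_auction(seed: int = 1, click_value: float = 1.5) -> Callable[[float], float]:
    """Invented environment: higher bids win more often but keep less profit."""
    rng = random.Random(seed)

    def auction(bid: float) -> float:
        win_probability = min(1.0, bid / 2.0)
        return (click_value - bid) if rng.random() < win_probability else 0.0

    return auction


BIDS = [0.25, 0.75, 1.25]
best_bid, estimates, counts = epsilon_greedy_bidder(BIDS, make_simulated_auction(), rounds=5000)
print("estimated profit per bid level:", [round(v, 3) for v in estimates])
print("times each bid level was tried:", counts)
```

The `epsilon` parameter is the "rewards and consequences" trade-off in miniature: most of the time the bidder repeats what has worked, and occasionally it tries something else in case conditions have changed.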

It’s also worth mentioning neural networks, a type of ML model inspired by the human brain’s structure and function. These networks are particularly powerful in tasks involving complex pattern recognition, such as image and speech recognition. In advertising, neural networks can enhance programmatic buying by evaluating vast datasets to identify subtle patterns in consumer behavior, enabling more precise targeting and personalization.

While these examples illustrate some of the common ML techniques used in advertising, the field is vast and continually evolving. Each method brings its own set of tools to the table, contributing to a more nuanced and sophisticated advertising ecosystem. As ML technology advances, its role in crafting more targeted, efficient, and engaging advertising experiences will only grow, pushing the boundaries of what’s possible in the digital marketing space.

Now, let’s address the role of LLMs in advertising. LLMs, such as GPT-4o, are advanced AI models that excel in understanding and generating human-like text. They are not primarily designed for data analysis or real-time decision-making—which are traditional ML use cases—but rather excel in tasks that involve language processing and text-based interactions. LLMs are being leveraged to automate processes that require a nuanced understanding of language and context, such as drafting personalized ad copy, facilitating customer service interactions, and enhancing chatbots.

In media buying and selling, LLMs are being applied to automate complex processes that traditionally required human intervention. By programming LLMs to think with a specific viewpoint and set of instructions, they can streamline tasks like scheduling and orchestrating campaigns, crafting and refining ad messages, and even generating comprehensive reports. These models act more as strategic partners, assisting human teams in managing and executing processes efficiently.

The application of ML in advertising remains robust, focusing on data-driven decision-making. ML algorithms excel in predictive targeting, analyzing vast datasets to identify optimal audiences, and optimizing bids in real-time to maximize ad spend efficiency. These capabilities are essential for real-time bidding environments and dynamic pricing models, where decisions must be made swiftly and accurately based on ever-changing data inputs.

Generative AI, the broader family that LLMs belong to, is making significant impacts in creative advertising. While LLMs specialize in language, other generative AI models are designed to create new images or even video. In advertising, generative AI can automate the creation of ad visuals or video content, generating variations tailored to different audience segments. This capability not only enhances creativity but also accelerates the production process, allowing for rapid experimentation and iteration.

The relationship between generative AI and LLMs is worth keeping straight. LLMs are one kind of generative AI, focused on language and dialogue, whereas image and video models extend the same idea to other forms of media content. Together, they offer a comprehensive toolkit for advertisers looking to innovate and engage audiences more effectively.

As AI continues to advance, ethical considerations around data privacy and algorithmic bias abound. Transparency in how AI systems operate and make decisions is essential to maintain consumer trust. Balancing automation with human creativity and insight is also crucial. While AI can handle data-driven processes, human expertise is irreplaceable in crafting compelling narratives and understanding the subtleties of consumer emotions.

Looking ahead, the fusion of AI technologies promises even more sophisticated advertising solutions. We can anticipate AI models that predict consumer needs with remarkable precision, integrating seamlessly into the consumer journey. This future of advertising is not just about efficiency—it’s about creating meaningful, anticipatory experiences that resonate with consumers on a personal level.

By automating routine tasks and augmenting human capabilities, AI is enabling advertisers to deliver more personalized, effective, and efficient campaigns. The key to success will be embracing AI as a partner, leveraging its strengths while preserving the creativity and empathy that only humans can provide.


The state of the art in using AI to create images

By Eric Picard

Update: 10/29/2024

I recently reread this article, which I wrote back in May of 2023 when MidJourney 5.1 was released. With the rapid development happening in this space, it is already archaic a year and a half later, with MidJourney 6.1 now the current version.

Two things have happened… 1. The model has gotten much better. 2. I’ve gotten better at writing prompts. I decided to try to solve this one from scratch, ignoring how I did it the first time, and here is the prompt I wrote:

Create a photograph of a man and woman posing for the camera holding each other : : The scene is a new england field with long grass and brightly colored wildflowers : : there are fieldstone walls in the background and a row of trees at the back of the field showing fall colors in red and yellow : : The couple both wear thick white cable knit sweaters and are in their mid-forties : : The woman is a beautiful, tall and willowy : : The man is clean shaven, rugged and handsome : : A house is positioned off in the distance, with rolling hills in between the couple and the house : : There is depth in the photograph, the composition is reminiscent of wyeth

And here are the 4 images it spit out without any additional work:

And without altering the prompt at all, within 5 minutes I got to what I see as an acceptable image:

I did of course, being me, play around quite a bit. I wanted to try curly hair and straight hair, vary their age slightly, bearded vs. clean shaven, contrasting sweaters, etc… So here are a few quick examples for you.
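Variations like these can also be generated systematically. The Python sketch below is only an illustration: it builds prompt strings from interchangeable pieces, and it assumes the spaced-out colons in the prompt above stand for MidJourney's double-colon multi-prompt separator (`::`). The option lists and the `build_prompt` helper are hypothetical, and nothing here talks to MidJourney itself; each prompt would still be pasted in by hand.

```python
from itertools import product

SEPARATOR = " :: "  # assumed to be MidJourney's multi-prompt separator

# Fixed parts of the scene, taken from the prompt above.
SCENE = [
    "Create a photograph of a man and woman posing for the camera holding each other",
    "The scene is a new england field with long grass and brightly colored wildflowers",
]

# Hypothetical options to vary between runs.
HAIR = ["curly hair", "straight hair"]
FACIAL_HAIR = ["clean shaven", "bearded"]
SWEATERS = ["thick white cable knit sweaters", "contrasting cable knit sweaters"]


def build_prompt(hair: str, facial_hair: str, sweaters: str) -> str:
    """Join the fixed scene with one combination of the variable details."""
    details = [
        f"The couple both wear {sweaters} and are in their mid-forties",
        f"The woman has {hair}",
        f"The man is {facial_hair}, rugged and handsome",
    ]
    return SEPARATOR.join(SCENE + details)


prompts = [build_prompt(*combo) for combo in product(HAIR, FACIAL_HAIR, SWEATERS)]
print(len(prompts), "variations")
print(prompts[0])
```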

As you can see, this is getting a lot better and a lot easier. Read on for a bit of history.

Original Article from May 2023

This article is a discussion of the current state of generative image AI software. Simple prompts can lead to beautiful images, but getting exactly what you’re looking for is quite involved, and requires some tricks to really manage the AI’s interpretations.

Note: For this article, I’m using MidJourney Version 5.1.  You will see in the text prompts some commands that are unique to MidJourney, the most obvious being –v 5.1 which is the command telling the MidJourney AI which version to use.  5.1 is the newest model supported by MidJourney as of this day (Friday, May 19th, 2023).  There are lots of other generative AI tools out there, but personally I’m finding that MidJourney gives me the results I prefer over the others. Also, some of the prompts have been edited for clarity and consistency, but not in a way that affects the output. None of the images have been edited.

The idea I have for this project is to create an idealized image of a couple in their forties, standing in a New England field, arm in arm, with a colonial house in the background. The vision I have for this is reminiscent of an Irish Spring or Old Spice commercial from the 1970s, but set today. So I’m going to be including things in the prompt that aim the AI at recreating some of the sense of this from back in that time.  Here is what I would write as a prompt as a starting point if I were just doing this for myself:

“a photograph of a couple posing for the camera holding each other, they both wear thick white cable knit sweaters and are in their mid-forties. She is a beautiful, tall and willowy woman, he is clean shaven, rugged and handsome, they stand in a new england field with long grass and brightly colored wildflowers, in autumn, they look towards the camera, there are fieldstone walls in the background and a row of trees at the back of the field showing fall colors in red and yellow, and a small colonial house visible in the distance. There is depth in the photograph, with the house being positioned off in the distance, with rolling hills in between the couple and the house.”

I have an idea that this would provide me with decent results just off of a prompt, but I know well enough that I would likely need some image prompts as well as text to get to my ideal end result: visual instructions that make the AI more likely to include all the elements I care about. Also, I know roughly how I want the characters to look. I’d love the man to look like James Purefoy in his role in Fisherman’s Friends, but more clean shaven, maybe just stubble. And the woman in my mind looks like Rachel McAdams dressed casually, or maybe Kate Mara with her aged makeup from the new show she’s in, “Class of ’09”.

But rather than applying all my accumulated knowledge of how to write prompts, I’m going to start out simple, because you’ll see that the first set of images that MidJourney produces is quite beautiful, but doesn’t meet my initial vision.

Starting with a simple prompt gets me this group of four images:

[1:20 PM]

a man and a woman stand in a field with long grass and wildflowers –v 5.1

Well – the field looks kind of like what I wanted, but none of the other background components are there, and the two figures are nothing like what I want.  I also know that trying to tune the whole image with two figures gets complicated, so I’m going to retrench and just start with a single figure, and I’ll use the man first.  I’ll start tuning my prompts until I start getting closer to what I want.

[1:21 PM]

a man stands in a field with long grass and wildflowers –v 5.1 –  

This is a good starting point, let me start tuning the man to get closer to my vision:

[1:23 PM]

a photograph of a tall man with broad shoulders wearing jeans and a sweater stands in a new england field with long grass and wildflowers in autumn, he looks toward the camera –v 5.1 

Okay – not quite what I’m looking for, but we’re getting there. Let me tune the man and the setting he’s in a bit:

[1:25 PM]

a photograph of a handsome and rugged tall man with broad shoulders, wearing jeans and a thick cable knit white sweater stands in a new england field with long grass and wildflowers of many colors in autumn, he looks towards the camera, there are stone walls in the background and a row of trees and a small house visible in the distance. –v 5.1 –  

That’s better but he’s not quite right, and I want more depth in the image. 

[1:28 PM]

a photograph of a handsome and rugged tall man with broad shoulders who looks like a James Purefoy, wearing jeans and a thick cable knit white sweater stands in a new england field with long grass and wildflowers of many vibrant colors in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –v 5.1 

Okay, not really better. Yes, one of those images looks like James Purefoy, but I lost all the other elements that I love. Sometimes focusing too much of the AI’s attention on one element loses you the rest. Also, the aspect ratio of the image isn’t very cinematic, so I’m going to set the aspect ratio going forward to a 2:1 ratio using the –ar command: 

[1:30 PM]

a photograph of a handsome and rugged tall man with broad shoulders, wearing jeans and a thick cable knit white sweater stands in a new england field with long grass and wildflowers of many vibrant colors in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1 –  

Okay – this is getting much closer, but he’s too young. I want to tune his age:

[1:31 PM]

a photograph of a handsome and rugged tall man in his forties with broad shoulders, wearing jeans and a thick white cable knit sweater stands in a new england field with long grass and wildflowers of many vibrant colors in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1 –  

Okay, a bit more age specific, but I don’t have my white cable knit sweater, and I see that the AI is ignoring my request for jeans. Maybe if I remove jeans, the sweater will resolve?

[1:34 PM]

a photograph of a handsome and rugged tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1 –  

That didn’t really help.  Let’s try specifying the color of the house and see if that helps. Also, these guys are all getting beards, so let’s clean that up:

[1:35 PM]

a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small white colonial house visible in the distance. –ar 2:1 –v 5.1 –  

Alrighty, the clean shaven prompt helped, but the white house request isn’t helping. And I still don’t have a white cable knit sweater. Let’s see what happens when I include the “Stylize” command. This lets the AI be more creative in how it executes. The Stylize command has a range of settings from 0 to 1000. I’ll put it on the highest setting to see what the difference is:

[1:37 PM]

a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –stylize 1000 –v 5.1 –  
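For readers who want to keep the parameters straight, here is a small, hypothetical Python helper that appends the aspect-ratio, Stylize, and version parameters to a text prompt. It is a sketch, not part of MidJourney: the helper's name and defaults are made up, and it assumes the parameters are typed with two plain hyphens (`--ar`, `--stylize`, `--v`), which blog formatting tends to turn into a single long dash. The 0 to 1000 bound on Stylize is likewise an assumption about this model version.

```python
from typing import Optional


def with_parameters(
    text: str,
    aspect_ratio: Optional[str] = None,
    stylize: Optional[int] = None,
    version: str = "5.1",
) -> str:
    """Append MidJourney-style parameters to the end of a text prompt."""
    parts = [text.strip()]
    if aspect_ratio is not None:
        parts.append(f"--ar {aspect_ratio}")
    if stylize is not None:
        # Assumed valid range; out-of-range values are rejected early.
        if not 0 <= stylize <= 1000:
            raise ValueError("stylize is assumed to range from 0 to 1000")
        parts.append(f"--stylize {stylize}")
    parts.append(f"--v {version}")
    return " ".join(parts)


base = "a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater"
print(with_parameters(base, aspect_ratio="2:1"))
print(with_parameters(base, aspect_ratio="2:1", stylize=1000))
```

Building the two variants from the same base text makes it easier to confirm that Stylize is the only thing that changed between renders.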

Well, you can see that the AI was a bit more creative in the layout, but it didn’t get us what I want. Now it’s time to get really aggressive with the background. I’m going to put in some images to illustrate aspects of what I like.  I’ll use these three images for all the rest of the renderings, but here they are for your review:

This really helps the AI see that I want depth in the image, also what I’m looking for with a line of trees in the distance. It doesn’t really help with the wildflowers though, nor with the stone walls. So I’ll add in more images:

Let’s see how that helped, keeping Stylize turned on:

[1:45 PM]

https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –stylize 1000 –v 5.1 –  

That was much better as far as putting depth and the wildflowers and the walls in – but let’s see if it works better with Stylize off:

[1:46 PM]

https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1 –  

Okay, so we’re getting somewhere.  I like the background quite a lot, although the figure is getting buried a bit.  Let me try this same set of images with a woman instead.

[1:48 PM]

https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a beautiful and willowy woman in a thick white cable knit sweater in her forties, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1 –  

That’s good – I’m seeing a consistent treatment of the setting, but a lot of play with the figure. Let’s see what Stylize does.

[1:50 PM]

https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a beautiful and willowy woman in a thick white cable knit sweater in her forties, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –stylize 1000 –v 5.1 –  

That didn’t seem to make a lot of difference.  So now I’m going to start playing with the figure and see if I can tune it towards what I want.  Let’s start with a general figure that’s the right age and wearing the sweater I want:

[1:52 PM]

photographic portrait of a Handsome Man, cleanshaven and rugged looking, in a thick white cable knit sweater –v 5.1 

Well, that’s not James Purefoy, but I think the 3rd image looks pretty good. I’ll render that out and look at it larger:

[1:55 PM]

photographic portrait of a Handsome Man, cleanshaven and rugged looking, in a thick white cable knit sweater –v 5.1 – Image #3  

Now that I have an image that gets more of the visual information to the AI, I’ll drop that into the first slot of my prompt:

[2:00 PM]

https://s.mj.run/ePXrbTX226c https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1
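The structure of a prompt like the one above (reference image links first, then the text, then the parameters) can be sketched in a few lines of Python. The `compose_prompt` helper is hypothetical, the text is shortened, and the parameters are written with two plain hyphens on the assumption that this is how MidJourney expects them to be typed.

```python
from typing import List

# Reference image links copied from the prompt above.
PORTRAIT = "https://s.mj.run/ePXrbTX226c"
BACKGROUNDS = [
    "https://s.mj.run/6oUsxj-acqs",
    "https://s.mj.run/M33VDlt9f2k",
    "https://s.mj.run/dKDRmKwbioI",
]


def compose_prompt(image_urls: List[str], text: str, parameters: List[str]) -> str:
    """Place image links first, then the text, then the parameters."""
    for url in image_urls:
        if not url.startswith("https://"):
            raise ValueError(f"not an https link: {url}")
    return " ".join(image_urls + [text.strip()] + parameters)


prompt = compose_prompt(
    [PORTRAIT] + BACKGROUNDS,  # the new portrait goes in the first slot
    "a photograph of a handsome and rugged clean shaven tall man",  # shortened
    ["--ar 2:1", "--v 5.1"],
)
print(prompt)
```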

And this is much better. Really close to what I’m looking for.  Now I just need to find a woman to help tune the image.  Here’s my first try on that.

[2:02 PM]

beautiful and willowy woman in a thick white cable knit sweater in her forties –v 5.1   

I find that MidJourney struggles a bit with age, particularly with women. I’d say these women look older than mid-forties, more like fifties or even early sixties. But for now, it should at least set the tone.

[2:03 PM]

beautiful and willowy woman in a thick white cable knit sweater in her forties –v 5.1 – Image #3  

[2:05 PM]

https://s.mj.run/DDmJPXMNGS0 https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a beautiful and willowy woman in a thick white cable knit sweater in her forties, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –v 5.1 –  

Great – this gets us pretty close to what I’m looking for, but let me try both the male and female version of these with Stylize maximized and see if it helps…

[2:09 PM]

https://s.mj.run/ePXrbTX226c https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a handsome and rugged clean shaven tall man in a thick white cable knit sweater in his forties with broad shoulders, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –stylize 1000 –v 5.1 –  

[2:10 PM]

https://s.mj.run/DDmJPXMNGS0 https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a beautiful and willowy woman in a thick white cable knit sweater in her forties, stands in a new england field with long grass and brightly colored wildflowers, in autumn, he looks towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –stylize 1000 –v 5.1 –  

I think that actually helped a bit, but not a ton. Now we’re ready to tune our image and text prompts to get closer to our final version.  I’m going to create a new prompt just to get the couple generated in a way that will help the AI.  As you’ll see in a moment, even with an explicit prompt, MidJourney really wants to blend two input images into a single image, especially with people.  So I know before I do this that I’m going to end up with a few different blends, but usually at least one image in the quad will follow the instructions. For this I’ve taken my two individual images of the man and woman, and dropped them in just to see what happens:

[2:14 PM]

https://s.mj.run/ePXrbTX226c https://s.mj.run/DDmJPXMNGS0 photographic portrait of a couple in their forties arm in arm, he is clean shaven rugged and handsome in a thick white cable knit sweater, she is tall and willowy wearing a fall colored sweater –v 5.1 –  

Okay – that was three blended single figures, and one that matched more what I was looking for. So I’ll take that one and see what happens.

[2:15 PM]

https://s.mj.run/ePXrbTX226c https://s.mj.run/DDmJPXMNGS0 photographic portrait of a couple in their forties arm in arm, he is clean shaven rugged and handsome in a thick white cable knit sweater, she is tall and willowy in her late thirties wearing a fall colored sweater –v 5.1 – Image #2  

Just so that I can tune this, let’s try just a text prompt and see what we get with no reference images, because this isn’t 100% what I’m looking for:

[2:16 PM]

photographic portrait of a couple in their forties arm in arm, he is clean shaven rugged and handsome in a thick white cable knit sweater, she is tall and willowy wearing a fall colored sweater –v 5.1 –  

Remarkably, I think not using reference images gave me a couple I could use here that is much closer to what I was looking for – with the third image.  

[2:17 PM]

photographic portrait of a couple in their forties arm in arm, he is clean shaven rugged and handsome in a thick white cable knit sweater, she is tall and willowy in her late thirties wearing a fall colored sweater –v 5.1 – Image #3  

[2:20 PM]

https://s.mj.run/F9kR5x8XeRM https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a couple in their forties wearing thick white cable knit sweaters, posing for the camera holding each other, she is a beautiful, tall and willowy woman , he is tall clean shaven, rugged and handsome, they stand in a new england field with long grass and brightly colored wildflowers, in autumn, they look towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance.  –ar 2:1 –v 5.1 –  

Now we’re cooking with fire.  I really like the first image, but there’s no house in it.  Let’s see if Stylize helps.

[2:26 PM]

https://s.mj.run/F9kR5x8XeRM https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a couple in their forties wearing thick white cable knit sweaters, posing for the camera holding each other, she is a beautiful, tall and willowy woman , he is tall clean shaven, rugged and handsome, they stand in a new england field with long grass and brightly colored wildflowers, in autumn, they look towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance.  –ar 2:1 –stylize 1000 –v 5.1 –  

Okay – that second image in the upper right is just about perfect.  I’d like it if he was wearing a sweater with a bit more texture and thickness.  And the AI has ignored my original request to have her in a cable knit sweater too.  But I like it even better.  So after all that, we have our final image:

[2:27 PM]

https://s.mj.run/F9kR5x8XeRM https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a couple in their forties wearing thick white cable knit sweaters, posing for the camera holding each other, she is a beautiful, tall and willowy woman , he is tall clean shaven, rugged and handsome, they stand in a new england field with long grass and brightly colored wildflowers, in autumn, they look towards the camera, there are stone walls in the background and a row of trees showing fall colors in red and yellow, and a small colonial house visible in the distance. –ar 2:1 –stylize 1000 –v 5.1 – Image #2

As you can see, this whole process from start to finish took just over an hour, and required me to think a lot about what I wanted, and to be willing to iterate quickly.  I also could have shortened that experience by starting with my very first prompt and iterating from there.  If I go back to that very first prompt, you’ll see that it gives us something good – but doesn’t really get me to what I had envisioned.

a photograph of a couple posing for the camera holding each other, they both wear thick white cable knit sweaters and are in their mid-forties. She is a beautiful, tall and willowy woman, he is clean shaven, rugged and handsome, they stand in a new england field with long grass and brightly colored wildflowers, in autumn, they look towards the camera, there are fieldstone walls in the background and a row of trees at the back of the field showing fall colors in red and yellow, and a small colonial house visible in the distance. There is depth in the photograph, with the house being positioned off in the distance, with rolling hills in between the couple and the house. –ar 2:1 –v 5.1

These are all very nice images, although we see MidJourney having its age issue again, and not listening well to the clean shaven input.  We’ve also lost a lot of what we gained from using the reference images, so I’ll drop those back in:

https://s.mj.run/F9kR5x8XeRM https://s.mj.run/6oUsxj-acqs https://s.mj.run/M33VDlt9f2k https://s.mj.run/dKDRmKwbioI a photograph of a couple posing for the camera holding each other, they both wear thick white cable knit sweaters and are in their mid-forties. She is a beautiful, tall and willowy woman, he is clean shaven, rugged and handsome, they stand in a new england field with long grass and brightly colored wildflowers, in autumn, they look towards the camera, there are fieldstone walls in the background and a row of trees at the back of the field showing fall colors in red and yellow, and a small colonial house visible in the distance. There is depth in the photograph, with the house being positioned off in the distance, with rolling hills in between the couple and the house. –ar 2:1 –v 5.1 

This is much better, and I could iterate on this to really get to an image that is as good as the one we ended up with from the longer process without as much work.
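When iterating this quickly, it can help to see exactly which words changed between two versions of a prompt. Here is a minimal sketch using Python's standard `difflib` module; the `prompt_changes` helper is hypothetical, and the two prompts are shortened versions of iterations from the session above.

```python
import difflib
from typing import Dict, List


def prompt_changes(old: str, new: str) -> Dict[str, List[str]]:
    """Return the words removed from and added to a prompt between iterations."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    removed: List[str] = []
    added: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(old_words[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(new_words[j1:j2])
    return {"removed": removed, "added": added}


# Shortened versions of two consecutive iterations.
before = (
    "a photograph of a handsome and rugged tall man in a thick white cable knit "
    "sweater, and a small colonial house visible in the distance."
)
after = (
    "a photograph of a handsome and rugged clean shaven tall man in a thick white "
    "cable knit sweater, and a small white colonial house visible in the distance."
)

changes = prompt_changes(before, after)
print("removed:", changes["removed"])
print("added:", changes["added"])
```

Keeping a log of these changes alongside the rendered images makes it easier to tell which edit was responsible for which improvement.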

This is where the state of play is today in Generative AI for images.  As a product person, I have a lot of ideas for what needs to happen from a design tools perspective to really supercharge the process.

For instance, I should be able to spend a lot of time generating a single “entity” like “Man in white cable knit sweater” and use that single entity over and over – without having it change each time.  I should be able to generate a landscape, get it perfect, and then drop other entities into it. Today that isn’t possible, but you can imagine that this would be a game changer.  
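To make the idea concrete, here is a purely hypothetical Python sketch of what a reusable “entity” could look like as a data structure. Nothing like this is described as existing in MidJourney in this article; every class and field name below is invented for illustration, including the `seed` field standing in for whatever would pin a rendering in place.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    """A character or object that is approved once and reused unchanged."""
    name: str
    description: str
    seed: int  # hypothetical: pins the rendering so it does not drift


@dataclass(frozen=True)
class Scene:
    """A landscape plus the entities placed into it."""
    landscape: Entity
    entities: Tuple[Entity, ...] = ()

    def place(self, entity: Entity) -> "Scene":
        """Return a new scene with one more entity; the originals stay untouched."""
        return Scene(self.landscape, self.entities + (entity,))


man = Entity("man", "Man in white cable knit sweater", seed=101)
meadow = Entity("meadow", "New England field with fieldstone walls", seed=202)

scene = Scene(meadow).place(man)
print([entity.description for entity in scene.entities])
```

The frozen dataclasses capture the key requirement: once a client has approved an entity, placing it in a new scene cannot alter it.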

I’ve spent a lot of my career building design tools and working in advertising, so I know pretty intuitively what a designer needs in order to live up to the requirements of working with clients. The specific characters (the exact face, the exact sweater, the exact color of the sweater) are all things that the client would want to approve. Today with generative AI, each time you render a prompt, even with really good reference images fed into the model, it’s very hard to get consistent results.

But you can probably now get a real sense of what the future holds. AI-powered design tools are going to change everything. And a lot of careers are going to morph over the next few years.
