The popular AI image generation service Midjourney has rolled out one of its most requested features: the ability to recreate characters consistently across new images.
By their very nature, AI image generators have struggled with this to date.
This is because most AI image generators rely on “diffusion models,” tools similar to or based on Stability AI's open source image generation algorithm Stable Diffusion. These work roughly by taking the text entered by a user and trying to piece together, pixel by pixel, an image that matches that description, based on what they have learned from similar pairings of images and text tags in a huge (and controversial) training dataset comprising millions of human-generated images.
Why consistent characters are so powerful, and elusive, for generative AI
However, as with text-based large language models (LLMs) like OpenAI's ChatGPT or Cohere's new Command-R, the problem with all generative AI applications is the inconsistency of their responses: the AI generates something new for every single prompt entered into it, even if the prompt is repeated or some of the same keywords are used.
This is great for generating entirely new pieces of content (in Midjourney's case, images). But what if you're working on a movie, a novel, a graphic novel, a comic book, or some other visual medium where you want the same character or characters to move through it and appear in different scenes and settings, with different facial expressions and props?
This exact scenario, which is usually necessary for narrative continuity, has been very difficult to achieve with generative AI so far. Midjourney is now tackling the problem by introducing a new tag, “--cref” (short for “character reference”), that users can add to the end of their text prompts in the Midjourney Discord. It will attempt to match a character's facial features, body type, and even clothing from the image URL that the user pastes in right after the tag.
As the feature matures and is refined, it could take Midjourney beyond being a cool toy or source of ideas and turn it into more of a professional tool.
How to use Midjourney's new consistent character feature
The tag works best with previously generated Midjourney images. So, for example, a user's workflow would be to first generate a character, or retrieve the URL of a previously generated one.
Let's start from scratch and say we're creating a new character with this prompt: “Bald, muscular man with a bead and an eyepatch.”
We'll upscale the image we like the most, then Control-click on it in the Midjourney Discord server to find the “Copy Link” option.
Next, we can type a new prompt such as “wearing a white tuxedo standing in a villa --cref [URL]”, pasting in the URL of the image we just generated, and Midjourney will try to recreate the same character from before in the newly described setting.
As you'll see, the results are far from an exact match for the original character (or even our original prompt), but they are certainly encouraging.
In addition, the user can control, to some extent, the “weight” of how closely the new image reproduces the original character by adding the tag “--cw” followed by a number from 0 to 100 to the end of the new prompt (after the “--cref [URL]” string), like this: “--cref [URL] --cw 100.” The lower the “cw” number, the more variation there will be in the resulting image. The higher the “cw” number, the more closely the resulting new image follows the original reference.
As you can see in our example, entering a very low “--cw 8” actually returned what we wanted: the white tuxedo. However, it also removed our character's signature eyepatch.
Well, there's nothing a little “Vary Region” can't fix, right?
Well, the eye patch was put on the wrong eye… but we got there!
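The prompt syntax walked through above can be sketched as a small helper. This is only an illustration of how the text is assembled: Midjourney is driven by typing prompts into Discord, so nothing here calls a Midjourney API. The function name `build_cref_prompt`, the URL check, and the placeholder URL are assumptions made for the example, and the 0 to 100 range for “--cw” is taken from the Midjourney note quoted later in this article.

```python
from typing import Optional


def build_cref_prompt(description: str, ref_url: str, cw: Optional[int] = None) -> str:
    """Assemble prompt text with a character reference (hypothetical helper)."""
    # The reference must be a pasted image URL, so do a minimal sanity check.
    if not ref_url.startswith(("http://", "https://")):
        raise ValueError("ref_url should be a full image URL")

    # The --cref tag goes at the end of the text prompt, followed by the URL.
    parts = [description.strip(), "--cref", ref_url]

    if cw is not None:
        # Assumed range: 0 (face only) up to 100 (face, hair, and clothing).
        if not 0 <= cw <= 100:
            raise ValueError("cw must be between 0 and 100")
        # The --cw tag comes after the "--cref [URL]" string.
        parts += ["--cw", str(cw)]

    return " ".join(parts)


# Placeholder URL, not a real Midjourney image link.
prompt = build_cref_prompt(
    "wearing a white tuxedo standing in a villa",
    "https://example.com/character.png",
    cw=8,
)
print(prompt)
```

The printed string is what a user would paste into the Midjourney prompt box; leaving out `cw` simply omits the weight tag and lets Midjourney apply its default.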
You can also blend multiple characters into one by using two “--cref” tags along with their respective URLs.
The feature only launched earlier this evening, and artists and creators are already testing it out. Try it yourself if you have Midjourney. And read founder David Holz's full note on the topic below:
Hi @everyone here, we're testing a new Character Reference feature today. This is similar to the Style Reference feature, except instead of matching a reference style, it tries to make the character match a Character Reference image.
How it works
- Type --cref URL after your prompt, with the URL of an image of a character
- You can use --cw to adjust the reference “strength” from 100 to 0
- Strength 100 (--cw 100) is the default and uses the face, hair, and clothing
- At strength 0 (--cw 0) it will only focus on the face (good for changing clothes/hair etc.)
What it's meant for
- This feature works best when using characters made from Midjourney images. It's not designed for real people/images (and will likely distort them as normal image prompts do)
- Cref works similarly to regular image prompts except that it “focuses” on character traits
- The accuracy of this technique is limited; it will not replicate dimples, freckles, or shirt logos exactly.
- Cref works with regular Niji and MJ models and can also be combined with --sref
Advanced Options
- You can use more than one URL to mix information/characters from multiple images like this: --cref URL1 URL2 (this is similar to multiple image or style prompts)
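As a rough sketch of the options in the note above, the following builds prompt text that lists several character reference URLs after a single --cref tag and optionally adds a style reference with --sref. The helper name `build_multi_ref_prompt` and the placeholder URLs are assumptions for illustration only; the code just produces the text a user would paste and does not talk to Midjourney.

```python
from typing import Optional, Sequence


def build_multi_ref_prompt(
    description: str,
    cref_urls: Sequence[str],
    sref_url: Optional[str] = None,
) -> str:
    """Assemble prompt text with one or more character references (hypothetical helper)."""
    if not cref_urls:
        raise ValueError("at least one character reference URL is required")

    # Several URLs follow one --cref tag, in the form: --cref URL1 URL2
    parts = [description.strip(), "--cref", *cref_urls]

    if sref_url is not None:
        # The note says --cref can also be combined with --sref.
        parts += ["--sref", sref_url]

    return " ".join(parts)


# Placeholder URLs, not real Midjourney image links.
blended = build_multi_ref_prompt(
    "two explorers crossing a desert",
    ["https://example.com/hero.png", "https://example.com/sidekick.png"],
    sref_url="https://example.com/style.png",
)
print(blended)
```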
How does it work on the web alpha?
- Drag or paste an image into the imagine bar, which now has three icons. Selecting one of these sets whether the image is an image prompt, a style reference, or a character reference. Shift+select an option to use an image for multiple categories
Remember, while MJ V6 is in alpha, this and other features may change suddenly, but the official V6 beta is coming soon. We would love to hear everyone's thoughts in Ideas and Features. We hope you enjoy this early release and hope it helps you as you play with building stories and worlds.