In context: Midjourney v6 has arrived as a Christmas gift for AI enthusiasts. This latest version of the image generator promises more realistic images and added functionality, and it addresses some significant shortcomings of the tool. However, experienced users may need to re-learn a few things.
The sixth iteration of Midjourney is now available to all users. It only took a few hours after launch for social media to be flooded with images showcasing its improvements.
To use Midjourney v6, simply add “--v 6” to the end of any prompt (remember that Midjourney works via Discord). Users can also switch to the new version by entering “/settings” in the Midjourney Discord server, or in a direct message to the Midjourney bot, and selecting V6 from the dropdown menu.
MidJourney v6 is a lot better at including words in images
Here are a few examples.
Prompts in the ALT! pic.twitter.com/EAGdq65hEZ
– Ammaar Reshi (@ammaar) December 21, 2023
Graphic designer Julie Wieland likened Midjourney v6 to an indie project evolving into a Hollywood production, praising the enhanced lighting effects. Other users have posted numerous realistic images, some indistinguishable from hand-edited work. Although errors still occur, finding them seems to take longer with each new version, which is both fascinating and concerning.
Midjourney’s development throughout the ~1.5 years pic.twitter.com/slfnIbDpXW
– Vensy (@vensykrishna) December 21, 2023
Wieland also noted that the updated prompting system required her to reevaluate her approach to using Midjourney. The developers claim that the tool’s natural language understanding has improved. User Tatiana Tsiguleva noted that prompts now need clear indications of style, subject, setting, composition, and other elements.
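As a rough illustration of that advice, the sketch below assembles a prompt from separate style, subject, setting, and composition pieces and appends the `--v 6` version parameter. The `build_prompt` helper, the ordering of the pieces, and the "wide shot" composition text are assumptions made for this example, not anything Midjourney provides or requires; the resulting string is simply what a user would paste after `/imagine` in Discord.

```python
def build_prompt(subject, style="", setting="", composition="", version="6"):
    """Hypothetical helper: join the non-empty descriptive parts,
    then append Midjourney's version parameter."""
    # Keep only the parts that were actually supplied (illustrative ordering).
    parts = [p.strip() for p in (style, subject, setting, composition) if p and p.strip()]
    description = ", ".join(parts)
    return f"{description} --v {version}"


prompt = build_prompt(
    subject="an ancient Roman marketplace during the day",
    style="35mm film still",
    setting="stalls with fruits, vegetables, and pottery",
    composition="wide shot",  # illustrative value, not from the article
)
print(prompt)
```

Keeping each element in its own argument makes it easy to see which part of the description is missing when a result does not match expectations.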
midjourney v6 really does feel like the indie production evolved to a hollywood production
midjourney v6 + magnific + lightroom pic.twitter.com/Akq86PpxuS
– Julie W. Design (@juliewdesign_) December 21, 2023
A notable new feature of Midjourney v6 is its ability to render legible text. Previously, garbled words were a common sign of AI-generated images. Now, users are sharing pictures with text in various styles, such as neon signs or chalk, demonstrating the tool’s proficiency in this area. This improvement also enables Midjourney to accurately recreate logos of well-known brands like McDonald’s or Coca-Cola.
The image below is a fairly good example of both how far Midjourney has come and of the remaining shortcomings of AI image generation. It convincingly depicts a fictional Netflix series poster starring Leonardo DiCaprio as Vladimir Lenin, even accurately rendering the title and Netflix logo, which earlier versions couldn’t do.
However, an authentic Netflix poster likely wouldn’t use the same font for the word “Netflix” under the title. Additionally, the second shot, depending on creative decisions, might not feature the Russian politician’s name in the Latin alphabet. It’s uncertain whether Midjourney v6 can handle non-Latin text.
35mm film still of an ancient Roman marketplace during the day. People in traditional Roman attire are bartering goods, there are stalls with fruits, vegetables, and pottery, and in the background, the Colosseum is visible.
--v 6 (top)
--v 5.2 (bottom) pic.twitter.com/ZHZyRs8MAz
– Nick St. Pierre (@nickfloats) December 21, 2023
A comparison of Midjourney v5.2 and v6 in depicting an ancient Roman market illustrates another point. The v6 image appears more authentic than its predecessor’s version. However, both inaccurately show the Colosseum in ruins during ancient Roman times. This highlights generative AI’s ongoing struggle with context and suggests that careful prompting might mitigate such logical errors.
1. Prompt: A man standing alone in a dark empty area, staring at a neon sign that says “EMPTY” pic.twitter.com/LTcDE9T5eB
– Chase Lean (@chaseleantj) December 21, 2023
Another interesting development is multi-panel images. Although AI image generators still face challenges in maintaining visual continuity in event sequences, Midjourney v6 can create a picture with multiple panels, each featuring a different subject or angle.