Although advanced experiments in Artificial Intelligence (AI) have been around for a long time, for most people AI was usually something portrayed in futuristic science fiction movies. Virtual assistants such as Google Assistant and Siri, which can understand natural language queries, have been helping us for some time.
Among AI art generators, which turn a text prompt into an image, the best known are DALL-E, Stable Diffusion and Midjourney. These are deep learning models that generate digital images from natural language prompts.
They are trained on huge data sets, which is what lets them recognize and interpret prompts given in natural language. The results are nothing short of amazing.
For example, using Midjourney with the prompt
"a large pancake with other small pancakes and maple syrup on a light brown background" may give us the following images.
The same prompt may give a slightly different result another time. The key is to be specific and detailed in the prompt, so that the generator produces the kind of image we are looking for.
With Midjourney, we can choose one of the images and ask the AI to upscale it with enhanced detail.
I had tried something similar with another service, whose name I do not recall now. The same query on Unsplash gives us some random orange-background images.
The more users interact and provide information for the system to learn from, the better it will become at producing accurate results. Being new to this, I am only able to come up with very basic prompts, but crafting the right prompts might become a job opportunity for those skilled at it.
Search for AI generated Images
A quick shout out to the service Lexica Art as well, which can be used as a search engine for images generated by AI. It is free and easier to use than some of the other services.
Most of these services are in public beta and are still fine-tuning their algorithms. Some are open source, but most limit free usage because running them requires huge computational power.
As in the Orange Racing Car example above, graphic and web designers could use these services to make quick mockups and placeholders. They give a starting point or provide inspiration to develop an idea quickly. At a later stage, art directors or photographers could use them to visualize how they want to capture a particular scene or aesthetic.
If used properly, these tools could increase productivity and efficiency.
As with all technology, these tools can be misused or manipulated, intentionally or not.
I understand these AI systems are run in a sort of controlled environment, or "User Mode". There are lists of banned words and other checks in place to prevent the AI from creating any inappropriate content.
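To make the idea of a banned-word check concrete, here is a minimal Python sketch. It is only an illustration: the word list and the function names are made up for this post, and it is not how any particular service implements its checks, which are likely far more involved than a simple word match.

```python
import re

# Hypothetical banned terms, chosen only for illustration.
BANNED_WORDS = {"gore", "blood"}


def find_banned_words(prompt, banned=BANNED_WORDS):
    """Return the banned words found in the prompt, sorted alphabetically."""
    # Lowercase the prompt and split it into plain words,
    # so that "BLOOD" and "blood," both match "blood".
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    return sorted(words & set(banned))


def is_prompt_allowed(prompt, banned=BANNED_WORDS):
    """A prompt is allowed only if it contains no banned words."""
    return not find_banned_words(prompt, banned)
```

With this sketch, the pancake prompt from earlier would pass, while a prompt containing one of the listed words would be rejected before it ever reaches the image generator.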
Some artists have complained that the AIs have been trained on their work and sometimes generate images in their style, even with their signatures.
It may not be a good idea to share personal information at this point, when apps like Lensa are used to generate stylized images from a person’s photos.
Sure, it looks great, but what data is being collected, and who can have access to it? Are we training this AI to label an image with a person’s name, which could then be used to identify the same person in CCTV footage?
- Stable Diffusion
- What is Midjourney
- Lexica Art