
https://colab.research.google.com/drive/1_4Jl0a7WIJeqy5LTjPJfZOwMZopG5C-W?usp=sharing

This is a brief tutorial on how to operate VQGAN+CLIP by Katherine Crowson. You don’t need any coding knowledge to operate it - my own knowledge of coding is very minimal.

I did not make this software; I just wanted to bring it to the public’s attention. Katherine Crowson is on Twitter @RiversHaveWings, and the notebook that inspired it (The Big Sleep, combining BigGAN+CLIP in the same way) was written by @advadnoun. In addition, if you’d like to see some really trippy video applications of this technique, check out the videos on @GlennIsZen’s YouTube page: https://www.youtube.com/user/glenniszen

Note: I purchased a subscription to Google Colab Pro, which gives priority access to better and faster GPUs, and lets sessions run longer before Colab times out. This is not a necessary measure, and you can do all of this without it. If you want to run generations for longer periods of time or have them run in the background without interruption, Colab Pro is a reasonable option.

STEP 1: Google Colab

Go to this Google Colab notebook (originally by Eleiber and Abulafia using Katherine Crowson’s code, translated into English by @somewheresy):

https://colab.research.google.com/drive/1_4Jl0a7WIJeqy5LTjPJfZOwMZopG5C-W?usp=sharing

Alternatively, the user @angeremlin has developed another version of the notebook, which allows for importing from Google Drive and batch processing:

https://colab.research.google.com/drive/1ud6KJeKdq5egQx_zz2-rni5R-Q-vxJdj?usp=sharing

Google Colab is like Google Docs for code. It gives you access to GPUs that Google operates in the cloud, so you don’t need to use your own computer’s processing power to run the program. This is useful, because generating art with AI can be very computationally intensive.

You’ll see that the page is split into lots of individual “cells.” Leave the vast majority of these alone. There are only two cells you should be changing: “Selección de modelos a descargar” (Selection of Models to Download) and “Parámetros” (Parameters).

STEP 2: Parameters

In the Parameters cell, you enter all the information necessary for VQGAN to create its image. Here’s a breakdown of all the different parameters in the cell:

textos/texts: This is where the written prompt goes, e.g. “a cityscape in the style of Van Gogh,” or “a Vaporwave shopping mall”

ancho/width: The width of the generated image in pixels

alto/height: The height of the generated image in pixels

modelo/model: The pretrained VQGAN model used to create the image; each model was trained on a different dataset (more on this in a second)

intervalo_imagenes/image_range: How frequently the GAN actually displays the in-progress image for you (e.g. every 50 iterations, every 5 iterations, every 500, etc.)

imagen_inicial/initial_image: This is optional. You can insert your own image to give the GAN something to start with, instead of having it generate an image from scratch
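
If it helps to picture how the parameters described so far fit together, here is a rough sketch in Python. It is not the notebook's actual code: the field names are taken from the Spanish labels above, but the values, the placeholder model name, and the `should_display` helper are all made up for illustration, and the real cell is a form you fill in rather than a dictionary.

```python
# Illustrative only: names follow the Spanish labels in the Parameters cell,
# but every value here is an assumption, not a default from the notebook.
params = {
    "textos": "a cityscape in the style of Van Gogh",  # the written prompt
    "ancho": 480,                   # width in pixels (assumed value)
    "alto": 480,                    # height in pixels (assumed value)
    "modelo": "placeholder_model",  # hypothetical name, not a real model
    "intervalo_imagenes": 50,       # show the image every 50 iterations
    "imagen_inicial": None,         # optional; None means start from scratch
}


def should_display(iteration, interval):
    """Hypothetical helper: True on iterations that are a multiple of interval."""
    return iteration % interval == 0


# Which of the first 200 iterations would show an in-progress image?
shown = [i for i in range(201) if should_display(i, params["intervalo_imagenes"])]
print(shown)  # [0, 50, 100, 150, 200]
```

A smaller interval shows you progress more often, while a larger one keeps the output area less cluttered on long runs.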