GANime Explained

Figure 1: GANime girl random sample
contents
1 Generation Algorithm
1.1 GANime generation
1.2 Upscaling
1.3 Tagging
1.4 Conclusion
2 Contract
2.1 Backend
3 Future
4 Useful links
list of figures
Figure 1 GANime girl random sample
Figure 2 Upscale sample, left one is original, right processed with SwinIR
Figure 3 Tagged examples
abstract
This paper outlines the current technology stack behind the GANime project, including the machine learning part, the software backend, and the NFT smart contract.
1 generation algorithm
1.1 GANime generation
The main idea behind the algorithm is to map a latent space into image space, f(z) → x. This can be done using GAN models. In short, the idea is to take a set of real images together with images generated from z and train a model that learns to distinguish between the real and the fake ones.
As a result, we obtain a model that can produce good, realistic images. We will not go into detail about how GAN models are trained, since many good people have already done it for us here or here. We would also like to make the space Z controllable. By controllable we mean that a small change to the variable z produces a correspondingly small change in the image, f(z + a) → x + a.
For this purpose, we decided to use the latest state-of-the-art network, StyleGAN3. It is not much different from StyleGAN2 in terms of image generation, but for videos it avoids situations where textures lose their temporal consistency. A great description of all the other methods can be found in this video. As a result, the StyleGAN model gives us a function f(z) → x, and by using a normally distributed random function we obtain g(y) → z.
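To make the two mappings above concrete, here is a minimal sketch of how a single image could be produced from an integer seed. The load_stylegan_generator helper, the checkpoint name, and the latent size of 512 are assumptions for illustration, not the project's published interface.

import numpy as np

def generate_from_seed(seed: int, latent_dim: int = 512):
    # g(y) -> z: draw a latent vector from a normal distribution,
    # deterministically reproducible from the integer seed
    rng = np.random.RandomState(seed)
    z = rng.randn(1, latent_dim)
    # f(z) -> x: map the latent vector to an image with the trained generator;
    # load_stylegan_generator() and the checkpoint name are hypothetical
    generator = load_stylegan_generator("stylegan3_ganime.pkl")
    x = generator(z)
    return x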
1.2 Upscaling

Figure 2: Upscale sample, left one is original, right processed with SwinIR
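The document only states that SwinIR performs the super-resolution shown in the figure, so the following is just a sketch of that step: load_swinir is a hypothetical helper returning a pretrained SwinIR network as a PyTorch module, and the scale factor and tensor layout are illustrative assumptions.

import torch

def upscale(x: torch.Tensor, scale: int = 4) -> torch.Tensor:
    # load_swinir() is a hypothetical helper that returns a pretrained
    # SwinIR super-resolution network as a torch.nn.Module
    model = load_swinir(scale=scale)
    model.eval()
    with torch.no_grad():
        # maps a low-resolution image tensor (1, 3, H, W)
        # to a high-resolution one (1, 3, scale*H, scale*W)
        return model(x)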
1.3 Tagging
We not only want to obtain high-quality generation but also to detect traits. Traits are very useful because they are the simplest way to estimate image rarity.
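As an illustration only (the document does not specify the exact formula), one common convention scores rarity by how infrequent an image's tags are across the whole collection:

from collections import Counter

def rarity_scores(tagged_images: list[list[str]]) -> list[float]:
    # count how often each tag appears across the collection
    counts = Counter(tag for tags in tagged_images for tag in tags)
    total = len(tagged_images)
    # rare tags contribute large terms, so images whose tags are
    # least frequent get the highest rarity score
    return [sum(total / counts[tag] for tag in tags) for tags in tagged_images]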
The 700k images used for generator training were utilized in this task too. All of them were tagged by the community. Some tags are not very accurate or overlap with each other, like "1girl", "single girl", "single", "one girl". So we selected the 1024 most common tags and later reduced that number to 512. As a backbone, we took a ResNet34 trained on the ImageNet dataset and later fine-tuned it on our dataset. Fine-tuned means that the original classes used for prediction were replaced with ours. We generated 21.5k images in total to estimate the tag distribution of our GAN. Then we manually selected the 32 best tags according to statistical criteria. As a result, we have a model that classifies 32 tags from generated images. Classification is done on small 192x192 images to reduce computational complexity.
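A minimal sketch of the backbone setup described above, assuming PyTorch/torchvision. The multi-label head and loss are illustrative; the document does not specify the exact training configuration.

import torch
import torch.nn as nn
from torchvision import models

# start from a ResNet34 pretrained on ImageNet
model = models.resnet34(weights="IMAGENET1K_V1")
# replace the original 1000-class ImageNet head with a 512-tag head
model.fc = nn.Linear(model.fc.in_features, 512)
# multi-label tagging: each tag is an independent sigmoid output
criterion = nn.BCEWithLogitsLoss()

# inference on a small 192x192 image, as described in the text
x = torch.randn(1, 3, 192, 192)
with torch.no_grad():
    tag_logits = model(x)              # shape (1, 512)
    tag_probs = torch.sigmoid(tag_logits)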
1.4 Conclusion
# x is the generated image from the previous steps (f(z) -> x);
# the helper functions below are defined earlier in the pipeline (not shown)

# downsize for autotagger
x_small = resize(x, (192, 192))
# detect tags
tags = get_tagger_model().run(x_small)
# select best
tags_best = select_best32(tags)

# upscale
x_big = get_upscaler_model().run(x)
2 contract
There are two common ways to store content on the blockchain: fully on-chain, where the media itself is written into the contract, and off-chain, where the contract stores only a link (for example, an IPFS URL) to the content.
We found a way to combine the best of both worlds. The main idea is to generate a random, unique seed using an on-chain pseudo-random function.
The seed is computed at minting time. After that, the URL with the content is stored in the contract as well. The content is not revealed instantly, but with a delay of up to a dozen minutes, depending on server load. This URL is provided by our backend, which runs neural-network inference using your random seed. The generation procedure is compute-intensive, so we generate the media for you once. We will also provide the source code with instructions on how to run the model, so you can restore the media from your unique seed yourself. The model with all sources will be stored on IPFS, and code for downloading it will be provided too.
This method allows us to generate the NFT itself on-chain and store the assets needed for reproducing it with a third-party provider. So your NFT will always be on-chain, and it is born on-chain too. All sources and instructions will be stored using the IPFS provider.
2.1 Backend
After you mint, the blockchain emits an event with a unique ID. We subscribe to these events and retrieve that ID. After that, our AI uses this number to generate a picture. Then our server places it on IPFS and pins it, receiving a link to the image. Pinning is an action that tells IPFS that the file should not be deleted at the next "garbage collection", i.e. the file is now guaranteed not to be removed from the system. To pin files, we use the popular service pinata.cloud.
3 future
We are researching the idea of combining existing NFT tokens, i.e. their latent vectors. The idea is to identify which parts of the latent space are responsible for a concrete trait and, when combining, simply average those sectors or apply a crossover. Basically, this would let you combine two GANime girls: for example, the first one with glasses and the second one with blue hair, with a possible result of blue hair with glasses.
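A minimal numpy sketch of the two combination strategies mentioned above (averaging trait sectors and crossover). Knowing which latent dimensions correspond to which trait is assumed here; identifying them is exactly the research question described.

import numpy as np

def combine_average(z1: np.ndarray, z2: np.ndarray, trait_dims: np.ndarray) -> np.ndarray:
    # average only the sectors of the latent vector responsible for the trait
    z = z1.copy()
    z[trait_dims] = 0.5 * (z1[trait_dims] + z2[trait_dims])
    return z

def combine_crossover(z1: np.ndarray, z2: np.ndarray, trait_dims: np.ndarray) -> np.ndarray:
    # crossover: take the trait sectors wholesale from the second parent
    z = z1.copy()
    z[trait_dims] = z2[trait_dims]
    return z

# e.g. take the "blue hair" dimensions (assumed indices) from girl 2
# while keeping everything else, including "glasses", from girl 1:
# z_child = combine_crossover(z_glasses, z_blue_hair, blue_hair_dims)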
The next thing we are thinking about is style transfer, i.e. taking an existing image, mutating it, and obtaining a cyberpunk or otherwise styled image. This mechanic can be interpreted as a kind of potion in the form of an NFT token.
We plan to give the girls animated facial expressions and give you the ability to control them in your browser. Our long-term plans include making them talk with you. In terms of facial animation we have achieved acceptable results, but there is still a lot of work to do.
We will not lie to you: making these plans real will take 1-2 years of full-time research and programming. All of these plans require a lot of computational resources for scientific experiments and model evaluation.
4 useful links
• Discord
• Beauty contest
• Style transfer 1
• Style transfer 2
• Video 1