Pokémon type classifier using their colors

Hello everybody! Here is my second article on the world of Pokémon.

The previous article was very practical, offering concrete advice for the first rival battle. Today, we’ll do something more conceptual and link colors with Pokémon types. As you may know, most Pokémon have colors that correspond to their types: blue – water, orange – fire, yellow – electric and so on. Let’s see what we can dig up on this matter. I might be wrong on some facts; feel free to comment to point out errors.


Other works on Pokémon colors

I’m a big fan of Pokepalettes, a simple website showing palettes of Pokémon – very useful when designing an Excel or PowerPoint document. In the same vein, pie charts of colors are marvelous to look at. There is also a BuzzFeed quiz where the goal is to guess the Pokémon given only 3 colors.

Sprites used

To analyze the Pokémon palettes, we’ll use the second-generation sprites, as they are simple and vibrant and show very limited color gradients compared to more recent versions. Let’s first download the sprites for all 251 Pokémon from Bulbapedia. I chose this website because its sprites have transparent backgrounds, unlike the white ones in several other zip files I stumbled upon. For simplicity, I only downloaded sprites from the Crystal version. Here is the R program used.

crystal

All 251 sprites from the Crystal version. Image taken from here.

3D visualization

Now that we have a sample of Pokémon sprites, we’ll let the data speak. First, let’s decide on a color model to visualize the color values in a 3D coordinate system. There are multiple color models to choose from (RGB, CMYK, HSV, …), so let’s use the RGB model as it is simple to understand. Analyzing the sprites previously downloaded, we can extract the primary color of each Pokémon (click here for the code). Then, we can plot the primary colors in a 3D RGB space (click here for the interactive version and here for the code):

Pokemon Primary Color

Pokemon primary color using the RGB color model. Click here for an interactive version.

Pure black and white were removed because they were the primary color of too many Pokémon and were usually not representative of them – for example, see Blastoise or Dewgong. Looking at the plot, we can admire how varied colors are in the world of Pokémon. Still, some colors seem less used than others: for example, green Pokémon are scarce and mostly limited to almost pure green tones. Also note the void in the green-blue and green-yellow areas. This may be because green Pokémon are exclusively bug- or grass-type Pokémon.
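As a rough illustration of this extraction step, here is a minimal Python sketch (not the R code linked above). It assumes that "primary color" means the most frequent opaque pixel color once pure black and white are excluded, which is my reading of the text rather than something the article spells out. The function name and the toy sprite are hypothetical.

```python
from collections import Counter

BLACK = (0, 0, 0)
WHITE = (255, 255, 255)

def primary_color(pixels):
    """Return the most frequent opaque color, ignoring pure black and white.

    `pixels` is an iterable of (r, g, b, a) tuples, e.g. the pixel data of a
    sprite with a transparent background. Returns None if nothing is left.
    """
    counts = Counter(
        (r, g, b)
        for r, g, b, a in pixels
        if a > 0 and (r, g, b) not in (BLACK, WHITE)
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy sprite: transparent background, black outline, blue body.
toy_sprite = (
    [(0, 0, 0, 0)] * 10             # transparent background
    + [(0, 0, 0, 255)] * 6          # black outline
    + [(255, 255, 255, 255)] * 3    # white highlights
    + [(72, 144, 216, 255)] * 5     # blue body
    + [(216, 72, 72, 255)] * 2      # red detail
)
print(primary_color(toy_sprite))    # (72, 144, 216)
```

With a real sprite, the pixel list would come from an image library; the counting logic stays the same.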

Type color space

Continuing on this question of Pokémon types, I was interested in separating the color space into Pokémon type regions. We know that blue is for water Pokémon, but what about brown or pink? Let’s analyze that statistically. Since Pokémon can have two types, a primary and a secondary (for example, Charizard is Fire/Flying), I’ll analyze them separately. We’ll use a k-nearest neighbors (knn) algorithm to classify points in the color space. As the name says, to classify a point, the algorithm looks at its k nearest neighbors and outputs the most popular type. Since there can be ties when k > 1, a weighted k-nearest neighbors algorithm, which puts more weight on the closest neighbors, will be used instead. To choose the optimal k, I used a built-in method that I won’t describe in detail (leave-one-out cross-validation); it results in k = 10 for the primary type and k = 22 for the secondary type. This means that if we want to know the primary type of an arbitrary point in the color space, the algorithm outputs the most popular type, weighted by proximity, among the 10 closest Pokémon (and the 22 closest for the secondary type).
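To make the weighted vote concrete, here is a small Python sketch. It assumes inverse-distance weights and Euclidean distance in RGB space; the article does not say which weighting scheme the R implementation uses, so treat this as one possible variant. The training points are made up for illustration.

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k):
    """Classify `query` (an RGB triple) from `train`, a list of (rgb, label).

    Each of the k nearest neighbors votes with weight 1 / (distance + eps),
    so closer neighbors count more and exact ties become unlikely.
    """
    eps = 1e-9
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for rgb, label in nearest:
        votes[label] += 1.0 / (math.dist(rgb, query) + eps)
    return max(votes, key=votes.get)

# Hypothetical training points, not the real sprite data.
train = [
    ((60, 120, 220), "water"),
    ((70, 140, 230), "water"),
    ((230, 120, 40), "fire"),
    ((240, 100, 30), "fire"),
    ((240, 220, 60), "electric"),
]
print(weighted_knn(train, (80, 130, 210), k=3))  # water
```

The three nearest points to the bluish query are two water points and one fire point; the water points are far closer, so their combined weight dominates.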

2012-10-26-knn-concept

General concept of the knn algorithm: if we want to classify the point marked by the red star with k = 3, we look at the 3 nearest points. Since the majority of those points are from Class B, the point is classified as B.
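The leave-one-out cross-validation used to pick k can be sketched as follows: each point is held out in turn, predicted from all the others, and the k with the best hit rate wins. This is a Python illustration under the same assumptions as before (inverse-distance weights, Euclidean distance), with hypothetical function names and toy data; it is not the built-in R method the article relies on.

```python
import math
from collections import defaultdict

def predict(train, query, k):
    """Distance-weighted k-nearest-neighbors vote over (point, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        votes[label] += 1.0 / (math.dist(point, query) + 1e-9)
    return max(votes, key=votes.get)

def loocv_accuracy(data, k):
    """Fraction of points predicted correctly when each is held out in turn."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every point except the held-out one
        hits += predict(rest, point, k) == label
    return hits / len(data)

def choose_k(data, candidates):
    """Return the candidate k with the best leave-one-out accuracy.

    On ties, the earliest candidate in the list is kept.
    """
    return max(candidates, key=lambda k: loocv_accuracy(data, k))

# Hypothetical toy data: two well-separated color clusters.
data = [
    ((0, 0, 255), "water"), ((10, 10, 240), "water"), ((5, 20, 250), "water"),
    ((255, 100, 0), "fire"), ((240, 90, 10), "fire"), ((250, 110, 5), "fire"),
]
for k in (1, 3, 5):
    print(k, loocv_accuracy(data, k))
print("chosen k:", choose_k(data, [1, 3, 5]))
```

On real data the accuracy would vary with k, which is what makes the selection meaningful; the toy clusters here are too clean to show that.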

Primary type

To better visualize the results, we will look at them using color wheel slices of the HSV (hue, saturation, value) color space. For V = 1, V = 0.55 and V = 0.3, we respectively have (code here):

ColorWheelV1T1

Primary type classifier using a 10-nearest neighbors algorithm when V = 1.

Type abbreviation meaning:
GRS: Grass, FIR: Fire, WTR: Water, BUG: Bug, NRM: Normal, PSN: Poison, ELC: Electric, GRD: Ground, FGT: Fighting, PSY: Psychic, RCK: Rock, GHT: Ghost, ICE: Ice, DRG: Dragon, DRK: Dark, STL: Steel, FLY: Flying.

ColorWheelV055T1

Primary type classifier using a 10-nearest neighbors algorithm with V = 0.55.

ColorWheelV03T1.png

Primary type classifier using a 10-nearest neighbors algorithm with V = 0.3.

An interactive version of the graph is available here (takes several seconds to load).
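Each color wheel above can be thought of as a grid over hue and saturation at a fixed V, where every grid point is converted to RGB and handed to the classifier. The Python sketch below only builds such a grid, using the standard `colorsys` module; the grid resolution and function name are assumptions for illustration, and the classification step is left out.

```python
import colorsys

def wheel_slice(value, hue_steps=12, sat_steps=4):
    """Return (hue, saturation, (r, g, b)) samples for one slice at fixed V.

    Hue and saturation are in [0, 1]; RGB components are integers in 0..255.
    """
    samples = []
    for i in range(hue_steps):
        hue = i / hue_steps
        for j in range(1, sat_steps + 1):
            sat = j / sat_steps
            r, g, b = colorsys.hsv_to_rgb(hue, sat, value)
            rgb = (round(r * 255), round(g * 255), round(b * 255))
            samples.append((hue, sat, rgb))
    return samples

# One slice per V level used in the figures above.
for v in (1.0, 0.55, 0.3):
    points = wheel_slice(v)
    print(v, len(points), points[3])
```

Each RGB triple could then be passed to the knn model to color that cell of the wheel by predicted type.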

At a quick glance, we can see some obvious palettes linked to certain types:

  • blue -> water
  • orange, dark red -> fire
  • yellow -> electric
  • green -> grass
  • pink, lavender -> psychic
  • magenta, purple -> poison
  • dark green -> bug
  • dark grey -> rock
  • ochre, brown -> ground
  • light desaturated colors -> fighting
  • dark blue -> dark

(I must admit I used a “name that color” app because I’m terrible with colors resembling magenta.)

Further analysis

As the second most popular type (44 Pokémon out of 251), the normal type covers a wide range of colors. There are two major palettes for normal Pokémon: light pink and brown.

The pink is notably lighter than the “psychic” pink, and the brown is less yellow than the “ground” brown. It is worth noting that in Generation VI, most of the pink Pokémon either gained fairy as a secondary type or had their normal type replaced by it.


Normal Pokémon from the 1st generation were mostly brown or pink

The most popular type in the second generation is the water type, with 46 Pokémon. Strangely, the classifier assigns some hues of red and pink to the water type. Why?

This is because of Krabby, Slowpoke, Octillery and their evolutions, which are examples of non-blue water Pokémon.

Secondary type

For the secondary type, I put a blank when the Pokémon had no secondary type. This means that the following graphs are mostly blank. Here’s what we can see:

ColorWheelV1T2

Secondary type classifier using a 22-nearest neighbors algorithm when V = 1.

ColorWheelV07T2

Secondary type classifier using a 22-nearest neighbors algorithm when V = 0.7.

With 22 neighbors, the algorithm smooths the results heavily: only two secondary types remain, poison and flying. In the first two generations, grass Pokémon often had poison as their secondary type and bird Pokémon were often normal/flying.

The interactive version lets you explore this in more detail.

Test on newer Pokémon

I wanted to know how well the algorithm would do with the following generation. As before, I downloaded the sprites, this time only for the new Pokémon (#252 to #386). Here they are (code here):

Generation3

A total of 135 Pokémon were added in the third generation (sprite from here).

The design in this generation is more complex, with more advanced color gradation but also more details. We also see that the colors are more desaturated. On an RGB plot showing the primary color, this means that the points lie nearer the diagonal joining pure white to pure black and much less on the edges:

Pokemon Primary Color 3

Primary color of the new Pokémon in the third generation. Click here for an interactive version.
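One way to put a number on "nearer the diagonal" is the Euclidean distance from an RGB point to the gray line R = G = B: pure grays score 0 and saturated colors score high. The Python sketch below is only an illustration of that measure, with hypothetical function names and made-up colors; the article itself relies on the visual plot.

```python
import math

def distance_to_gray_diagonal(rgb):
    """Euclidean distance from an RGB point to the line R = G = B.

    The closest point on that line is (m, m, m), where m is the mean of
    the three channels, so the distance is the norm of (rgb - m).
    """
    mean = sum(rgb) / 3
    return math.sqrt(sum((c - mean) ** 2 for c in rgb))

def mean_distance(colors):
    """Average distance to the gray diagonal over a list of RGB triples."""
    return sum(distance_to_gray_diagonal(c) for c in colors) / len(colors)

# Hypothetical primary colors, not measured from the actual sprites.
vivid = [(255, 0, 0), (0, 0, 255), (255, 200, 0)]
muted = [(200, 120, 120), (110, 120, 160), (180, 170, 120)]
print(mean_distance(vivid), mean_distance(muted))
```

Comparing this average between the two sets of sprites would be one way to check the desaturation claim numerically.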

Now, if we apply the previously built k-nearest neighbors model to these new sprites, we get the following results (click on the Pokémon to see their type and what was predicted):

  • 16 Pokémon had only a primary type and it was correctly guessed:
  • Only 2 Pokémon had a dual type where both types were guessed:
  • 2 Pokémon had their primary type correctly guessed, but not their secondary type:
  • 6 Pokémon had their primary type correctly guessed, but the model predicted no secondary type:
  • 3 Pokémon had their secondary type correctly guessed, but not the primary type:
  • 6 Pokémon had their primary type guessed as their secondary type or vice versa:

Impressively, the model correctly guessed the types of the three starters. Unfortunately, the model was perfectly right only 13% of the time, and there are multiple reasons for this. One reason is that there are previously unseen type combinations (grass/dark, ground/dragon, etc.). Another reason, as mentioned earlier, is that the sprites show more complex color gradation. One solution would have been to calibrate the model using the sprites from the same generation. Another would have been to adjust the model itself by desaturating all colors in it. To check the predictions for all Pokémon, including the ones that were incorrectly guessed, I put an Excel file on GitHub.
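The 13% figure can be checked from the counts listed above, assuming that "perfectly right" covers the single-type Pokémon whose type was guessed plus the dual-type Pokémon with both types guessed. That reading is my assumption; the variable names are illustrative.

```python
total_new = 135           # Pokémon added in the third generation
single_type_correct = 16  # single-type Pokémon whose type was guessed
dual_type_correct = 2     # dual-type Pokémon with both types guessed

fully_correct = single_type_correct + dual_type_correct
share = fully_correct / total_new
print(f"{fully_correct} / {total_new} = {share:.1%}")  # 18 / 135 = 13.3%
```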

Closing remarks

This concludes this article on Pokémon colors. Let’s recap what was done. First, we extracted the Pokémon Crystal sprites from Bulbapedia. From the sprites, we plotted their primary colors in a 3D RGB space. Then, we used this data to build a model predicting the type from the colors. Finally, we tested the model on the new third-generation Pokémon.

The model should now be tested on newer versions, or compared across different games. There are now more than 700 Pokémon to classify.
