
Extracting AI Personality from AI Behaviors Using Particle Swarm Optimization and Artificial Psychosocial Framework

Introduction

    Imagine you are playing Cyberpunk 2077, as in the image below. You are in a firefight with AI enemies. The innocent civilians scream and run away when the fight begins. After you kill all the enemies, those civilians suddenly reappear and act as if nothing happened. They might even ignore the bodies on the ground.

Picture1.jpg

    Game developers have tried many different AI implementations to avoid situations like this. Many in-game AIs have complicated behaviors that let them respond differently to different situations. However, those behaviors are often hard-coded and can look stupid in special cases. Furthermore, even well-polished behaviors cannot be carried over to other scenarios, levels, or sequels. Another problem with hard-coded behavior patterns is that they must be redesigned whenever a new feature is introduced. If we design a fancy new weapon that transforms all enemies into pigs, the AI enemies need a new behavior in response to this weapon, or players will think "the AIs got more stupid after the update".

    If we had something that represents a fixed pattern of AI behaviors, the problems above could be addressed. If we treat AIs as human-like, then a pattern that characterizes a person's distinct behaviors is called a personality. With a personality, an AI has a characteristic pattern of behaviors that is more human-like and reasonable, and it follows that same pattern across different scenarios and features. For example, if we assign personalities to the characters in an RPG, we can add random events to the game and the characters can still respond to them in character. No more hard-coded conditional behaviors.

Picture2.png

    Now we know that we want to assign a personality to AIs, but the problem is how to represent a personality correctly. I use the Big Five personality model, also called OCEAN. As the image above shows, the OCEAN model has five traits that represent five distinct characteristics: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Each trait has two directions, high or low, and I define each trait as a float value from 0 to 1. For example, an extraversion of 0 means introverted, and an extraversion of 1 means extraverted.
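
    The representation described above can be sketched as a small data structure. This is only an illustration: the class name `Personality`, its method, and the example values are assumptions, not taken from the thesis code.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Personality:
    """OCEAN traits, each a float in [0, 1] (0 = low, 1 = high)."""
    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float

    def __post_init__(self):
        # Reject values outside the 0-1 range used throughout this article.
        for name, value in vars(self).items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def as_vector(self):
        """Return the five traits as a tuple in O, C, E, A, N order."""
        return astuple(self)


# A hypothetical open, introverted NPC; the numbers are illustrative only.
npc = Personality(0.8, 0.5, 0.1, 0.6, 0.4)
```

    Keeping the traits as a flat vector of five floats is convenient because a search algorithm can treat a personality as a point in a five-dimensional space.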

    Even though I know which personality model I will use, it is still unclear what value each trait should have. Often we only know a range. For example, we want to build an open NPC, but we don't know what the exact openness value should be. Hence, the goal of my thesis is to extract a specific AI personality from AI behaviors.

    How do we extract a personality? The idea comes from common personality surveys, which ask how people act in different situations. We can infer a person's personality from their behaviors. However, one action can mean different things in different situations, and there are millions of actions we would need to test, so analyzing behaviors directly is hard and time-consuming. Instead, we can summarize behaviors as likes and dislikes. The image below shows an example. I play video games with my friends; I watch movies every week; I drive to other cities when I have free time. We can summarize those behaviors as my loves and hates: I like playing video games, watching movies, and traveling. In psychology, these loves and hates are called valence: the subjective spectrum from positive to negative evaluation of an experience a person may have had. I will use this term in the rest of the discussion. Based on the valence values, we can extract the personality. From what I like and dislike, we can probably figure out what type of person I am.

WeChat Screenshot_20220315175505.png

    We now have a clear idea of why we need a personality and how we can get one. This leads to my thesis: using defined behaviors to find a personality that summarizes the pattern behind those behaviors. Once we have the personality, we can use it to define more behaviors that follow the same pattern as the previously defined ones. Hence, the end goal is to take the required behaviors as input and output a personality.

Artifact

    The image below shows the overall process my artifact uses to extract a personality from defined behaviors. First, we define a behavior fitness function that uses game statistics to evaluate AI behaviors. Then, we pass the fitness function to a valence training model, which uses Particle Swarm Optimization to train valence values and outputs a set of them. Next, we pass the valence values to a second model, which trains personalities. As a result, it outputs the final personality that we are looking for.

WeChat Screenshot_20220315181236.png

Behavior Fitness Function

    The behavior fitness function evaluates the behaviors of an AI in the game and returns a score for how well the AI performs them. The image below shows an example of a similar fitness function in League of Legends. Since developers only know which behaviors they want the AIs to show, we need a way to determine whether an AI is actually performing those behaviors. For example, if I want a tank that always defends against incoming attacks and never retreats, the fitness function analyzes the AI's behaviors during the game to see whether it defends a lot and stays in the frontline. If the AI does everything I required, the function returns 1. If the AI does nothing I want it to do, the function returns 0. The return value is a float from 0 to 1 that represents how well the AI performs the behaviors we want.

Picture3.jpg
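
    A minimal sketch of a fitness function for the tank example might look like the code below. The `GameStats` fields and the equal weighting of the two behaviors are assumptions made for illustration; they are not the statistics or weights used in my game.

```python
from dataclasses import dataclass


@dataclass
class GameStats:
    """Hypothetical per-game statistics for one AI."""
    turns_taken: int
    defend_actions: int
    turns_in_frontline: int


def tank_fitness(stats: GameStats) -> float:
    """Score in [0, 1]: 1 means the AI always defended and never left the frontline."""
    if stats.turns_taken == 0:
        return 0.0  # no turns played, so nothing to reward
    defend_ratio = stats.defend_actions / stats.turns_taken
    frontline_ratio = stats.turns_in_frontline / stats.turns_taken
    # Equal weighting of the two desired behaviors is an assumption.
    return 0.5 * defend_ratio + 0.5 * frontline_ratio


# A tank that stayed in the frontline all game but defended on only half its turns.
score = tank_fitness(GameStats(turns_taken=10, defend_actions=5, turns_in_frontline=10))
# score == 0.75
```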

Valence Training Model

    Once the behavior fitness function is defined, we can pass it to the artifact, which follows the process above to find a personality. More specifically, the artifact first uses the fitness function to find which actions the AI likes and dislikes, that is, its valence values. This step uses the valence training model, which is built on an algorithm called Particle Swarm Optimization (PSO).

 

    The image below shows how the model uses PSO to find valence values. Each particle has its own position and velocity in the solution space. After each move, the particle passes its current position into its own game simulation as the input valence values. The game then runs. Once the game finishes, it outputs game information, which is passed to the fitness function. The fitness function evaluates how the AI behaved with those valence values and returns a score. If the score is higher than the particle's local best score, the particle updates its best score and best valence values. Lastly, the score is compared with the global best score of the whole swarm, and the global best is updated if the new score is higher.

WeChat Screenshot_20220315201113.png

    This model allows us to find valence values that have the highest behavior score, and the valence values will be the input for the next model which will train the personality. 
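
    The loop described in this section can be sketched as a small PSO routine. This is not my artifact's code: the inertia and acceleration coefficients are common textbook defaults chosen as assumptions, and `toy_score` is a stand-in for "simulate a game, then run the fitness function".

```python
import random


def pso_maximize(score_fn, dims, n_particles=20, iterations=50,
                 inertia=0.7, c_local=1.5, c_global=1.5, seed=0):
    """Search [0, 1]^dims for the position with the highest score_fn value."""
    rng = random.Random(seed)
    positions = [[rng.random() for _ in range(dims)] for _ in range(n_particles)]
    velocities = [[0.0] * dims for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]           # each particle's best position
    best_score = [score_fn(p) for p in positions]  # each particle's best score
    g = max(range(n_particles), key=lambda i: best_score[i])
    global_pos, global_score = best_pos[g][:], best_score[g]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c_local * r1 * (best_pos[i][d] - positions[i][d])
                                    + c_global * r2 * (global_pos[d] - positions[i][d]))
                # Move, then clamp to the valid 0-1 valence range.
                positions[i][d] = min(1.0, max(0.0, positions[i][d] + velocities[i][d]))
            score = score_fn(positions[i])  # stand-in for simulation + fitness
            if score > best_score[i]:       # update the particle's local best
                best_score[i], best_pos[i] = score, positions[i][:]
            if score > global_score:        # update the swarm's global best
                global_score, global_pos = score, positions[i][:]
    return global_pos, global_score


# Toy stand-in for a game: the score peaks at a hidden set of valence values.
target = [0.9, 0.1, 0.5]


def toy_score(valences):
    return 1.0 - sum(abs(v - t) for v, t in zip(valences, target)) / len(target)


best_valences, best = pso_maximize(toy_score, dims=3)
```

    In the real model each call to the score function is a full game simulation, so the number of particles and iterations directly controls how many games must be played.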

Personality Training Model

    This model also uses PSO, this time to train the personality. The image below shows how PSO is applied. Instead of using the fitness function, this model compares valence values. We want to find the personality that causes the smallest valence change during the game simulation. Why? If the valence values are correct and lead to the correct action, the personality should like that action too. If the valence values change, which means the AI changes its mind, then they do not fit that personality. We want to find a match where the AI picks an action based on its valence values, and that action is also what its personality wants to pick.

WeChat Screenshot_20220315202542.png
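
    One way to express this objective in code is to score a candidate personality by how little the valence values drift during a simulated game. The sketch below is an assumption-heavy illustration: `simulate` is a hypothetical callable, and `toy_simulate` is an invented rule that only exists to make the example runnable.

```python
def valence_drift(start_valences, end_valences):
    """Total absolute change in valence over one simulated game (lower is better)."""
    return sum(abs(end_valences[a] - start_valences[a]) for a in start_valences)


def personality_score(personality, trained_valences, simulate):
    """Score for PSO to maximize: negative drift, so zero drift is the best case.

    `simulate` is a hypothetical callable that plays one game with the given
    personality and starting valences and returns the valences at the end.
    """
    end_valences = simulate(personality, dict(trained_valences))
    return -valence_drift(trained_valences, end_valences)


def toy_simulate(personality, valences):
    # Toy rule: the "attack" valence is pulled halfway toward extraversion,
    # which is index 2 in O, C, E, A, N order.
    valences["attack"] += 0.5 * (personality[2] - valences["attack"])
    return valences


trained = {"attack": 0.8, "heal": 0.2}
matching = personality_score((0.5, 0.5, 0.8, 0.5, 0.5), trained, toy_simulate)
mismatched = personality_score((0.5, 0.5, 0.0, 0.5, 0.5), trained, toy_simulate)
# The matching personality causes no drift, so it scores higher.
```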

Artificial Psychosocial Framework

    Now we know the overall process of how the personality is extracted. Both models simulate games with different inputs, so it may be unclear how AIs choose actions and change valence values. The image below illustrates how AIs change their valence values during the game simulation. The Artificial Psychosocial Framework (APF) is a framework designed by Jake Klinkert that allows AIs to generate emotions based on their personalities and valence values. It was extended by Vivian Wei, who designed a memory system and added it to the APF. The memory system allows AIs to remember events. Using these memories, AIs change their valence values according to how good or bad an event's result was. For example, if the AI's attacks keep dealing less damage than expected, it might start to dislike attacking because it has failed so many times.

WeChat Screenshot_20220315204322.png

    Notice the green block in the APF graph above. It is a new system that I designed and implemented, and it also updates the valence values after an action is performed. It is not part of the APF, but it runs in parallel with the APF, and the two change the valence values together.

Valence Updates

    In the valence training model, the game output differs when we apply different input valence values, because the AIs choose different actions based on those values. The image below illustrates how an AI chooses an action. It first assigns weights to the available actions based on their valence values. A higher valence value means a higher chance of being picked. This gives the AI a chance to pick every action instead of picking the same one forever (unless an action's valence value is 0). Once an action is chosen, the AI needs to find a target. It generates expectations from the memory system for this action against each possible target, then picks the target with the best expectation. For example, suppose the AI needs to attack an enemy, and there are a tank, a ranger, and a healer. Based on its expectations for attacking the tank, the ranger, and the healer, it chooses the one with the best expected result.

WeChat Screenshot_20220315215549.png
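
    The two-step choice can be sketched as follows. The function names, the fallback for the case where all valence values are 0, and the example numbers are assumptions for illustration.

```python
import random


def choose_action(valences, rng=random):
    """Pick an action with probability proportional to its valence value."""
    actions = list(valences)
    weights = [valences[a] for a in actions]
    if sum(weights) <= 0:
        # Fallback when every valence is 0 (an assumption; not covered in the text).
        return rng.choice(actions)
    return rng.choices(actions, weights=weights, k=1)[0]


def choose_target(action, targets, expectation):
    """Pick the target with the best expected result for this action."""
    return max(targets, key=lambda t: expectation(action, t))


# Hypothetical valence values: "heal" has valence 0, so it is never picked.
valences = {"attack": 0.7, "defend": 0.3, "heal": 0.0}
action = choose_action(valences)

# Hypothetical expected damage per target, as the memory system might report it.
expected_damage = {"tank": 20, "ranger": 35, "healer": 30}
target = choose_target("attack", ["tank", "ranger", "healer"],
                       lambda a, t: expected_damage[t])
# target == "ranger"
```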

    What is an expectation, and how do we get one? The image below shows how an expectation is generated from the memory system. The AI recalls all similar memories from the system and assigns a weight to each one, with more recent memories weighted higher. Then, it combines the memories according to their weights into a general expectation. For example, the AI might expect to deal 20 damage when attacking a tank. Depending on the actual result, the AI may be disappointed or happy, which causes its valence values to change.

WeChat Screenshot_20220315215643.png
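
    A minimal sketch of a recency-weighted expectation is shown below. The linear weights 1, 2, ..., n are an assumption; the description above only requires that more recent memories weigh more.

```python
def expectation(memories):
    """Recency-weighted average of remembered results.

    `memories` is ordered oldest to newest. Linear weights 1, 2, ..., n are an
    assumption made for this sketch.
    """
    if not memories:
        return None  # nothing to recall, so there is no expectation
    weights = range(1, len(memories) + 1)
    return sum(w * m for w, m in zip(weights, memories)) / sum(weights)


# Hypothetical remembered damage from three past attacks on a tank, oldest first.
expected = expectation([26, 20, 18])   # (26*1 + 20*2 + 18*3) / 6 = 20.0

# Comparing the actual result with the expectation gives the direction of the
# valence change: a positive surprise is pleasant, a negative one disappointing.
surprise = 14 - expected               # -6.0, so the AI is disappointed
```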

    Now we know how the memory system changes the valence values, but how does the personality change them? The first idea is that we can derive valence values from the personality. Imagine we are playing a new game that we have never played before. After playing for a while, we have a general idea of whether we like it or not, which is a valence value. This is the value that our personality gives us. There is a relationship between personality and valence, and since we design the actions, we can often define this relationship manually. For example, when we design a new action called range attack, we can define that an open and extraverted person would use it more often, so the valence value of range attack is high for an open and extraverted person.

    Once we know the relationships between all actions and all personality traits, we can use them to update the valence values. The image below shows a simple example. Whenever the AI attacks, it first looks up the valence value derived from its personality and compares it with its current valence value. If they don't match, the current value is moved toward the value from its personality. The input valence values that we apply at the beginning of this model are like loves and hates that someone else told you about, while the valence values derived from your personality are your own loves and hates, the ones that come from your soul.

WeChat Screenshot_20220315221123.png
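
    The update can be sketched in two small steps: derive a valence value from the personality, then move the current value toward it. The weighted-average mapping, the trait weights, and the update rate are all assumptions for illustration; the actual relationship is defined by the designer for each action.

```python
def valence_from_personality(personality, trait_weights):
    """Weighted average of the traits the designer linked to an action."""
    total = sum(trait_weights.values())
    return sum(w * personality[t] for t, w in trait_weights.items()) / total


def pull_toward_personality(valence, personality_valence, rate=0.2):
    """Move the current valence a fraction of the way toward the personality's value."""
    return valence + rate * (personality_valence - valence)


# A hypothetical open and extraverted character.
personality = {"openness": 0.9, "conscientiousness": 0.5, "extraversion": 0.7,
               "agreeableness": 0.4, "neuroticism": 0.3}

# Designer-defined link: range attack appeals to open, extraverted characters.
range_attack_traits = {"openness": 1.0, "extraversion": 1.0}

wanted = valence_from_personality(personality, range_attack_traits)  # about 0.8
updated = pull_toward_personality(0.3, wanted)                       # about 0.4
```

    When the current value already equals the value derived from the personality, the update leaves it unchanged, which is the "no valence change" case that the personality training model searches for.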

    With both systems explained, we can combine them and check what the actual valence change is after each action. The image below shows how the update is determined by the two systems together.

WeChat Screenshot_20220315221404.png

Demo

    Now I want to show how this process works in an actual program. Before extracting personalities from our own characters, I first want to check whether we can extract a correct personality from a well-known character. If the personality produced by the program matches the character's actual personality, then we can say the program is working correctly. For this, I designed and implemented a simple 2D turn-based RPG in which AIs fight against each other. Each side has 4 AIs, and only one AI uses our personality system.

Picture4.png

    The image above shows a famous character, Sylvanas Windrunner, from World of Warcraft. She is the Warchief of the Horde and also the Banshee Queen. We can easily identify her characteristics: competitive, harsh, a perfectionist, and a powerful ranger. According to Magdan Cvitesic's article about the personalities of World of Warcraft characters, Sylvanas's personality is INTJ in the Myers–Briggs (MBTI) personality model.

    Now we know her personality, but only in MBTI terms, while we are using the OCEAN model. Fortunately, we can convert INTJ into OCEAN values. The image below shows the conversion.

WeChat Screenshot_20220315222737.png

   Once we have a general idea about her personality, we can start defining the fitness function. Again, we can easily interpret it from her characteristics. The image below shows the final fitness function.

WeChat Screenshot_20220315223222.png

    Let's use this function to extract Sylvanas's personality. The video below shows how the program identifies the personality.

    Comparing the personality produced by the program with the personality we defined at the beginning, we can see that the program finds a roughly correct personality. We might have expected more extreme values, such as 0.3 or 0.2. Since my program is not World of Warcraft, it cannot recover her exact personality. However, the result is satisfying, and it suggests that extracting a personality from a character's behaviors is possible.

WeChat Screenshot_20220315223548.png

    Lastly, we can define our own character and find out what its personality is. The image below shows the personality of my own healer. Again, the score is not very high, but I am satisfied with it and want to know what this healer's personality is. From the output, we can see that this healer is extraverted and agreeable, which makes sense for a healer. However, we can also see that this healer is not conscientious. One reason may be that it also loves short attacks, while a conscientious healer should always focus on its own job. Later on, if I want to use the same healer again, I can create one simply by assigning this personality, and it should behave the way I expect.

WeChat Screenshot_20220315224003.png

Future Usages

    As I mentioned before, this idea can be used in many games. When we want consistent behaviors, we can apply personalities. When we want a character with distinct characteristics, we can apply personalities. When we want to know what the appropriate behavior is for an AI in a given situation, we can apply personalities. A personality gives behaviors both consistency and uniqueness.

    Imagine an RPG in which we design hundreds of AI characters, and each of them responds in its own way to every possible event. This would be the game of the year. Such AIs can give players a much better gameplay experience because they make the world feel alive. A better AI can also make players feel they are dealing with real people instead of a piece of code. By applying personality to AIs, it is possible to create a more human-like Artificial Intelligence.

References
