Thank you for taking the time to visit my portfolio website! Here I've laid out some of my favourite past projects, as well as what I'm currently working on and discovering!
I'm a Data Scientist with 3+ years of experience, specializing in statistical analysis and model development!
I graduated with a B.S. in Mathematics with a focus in both Data Analysis and Business Intelligence from the University of South Florida.
Within Data Science, I've found myself most interested in the applications of neurosymbolic AI and the development of language models.
Outside of my professional background, I love cooking, taking a walk with music, and constantly challenging myself in the gym; I am currently located in New York City.
Feel free to check out my GitHub here, or connect with me on LinkedIn!
Table and Summaries of Projects
Below you can find some of my personal projects that I've completed off the clock; while each project is unique in tools and skills, my goal is always the same: challenge myself!
Each project has a report with results and figures, linked in both the table and the summary header!
Utilising the Spotify API and real-life user data to develop a playlist recommender model aimed at increasing user engagement (i.e., decreasing customer churn).
A PyTorch-based reinforcement learning model that learns to perform specified actions in an open-world video game (Pokémon Platinum), with the assistance of a computer vision model.
In this project, I set out to build an AI system capable of playing Pokémon Platinum, an open-world, story-driven video game, using computer vision and reinforcement learning with human feedback (RLHF)!
The project consists of two main models: an annotation model fine-tuned from YOLOv9s for object detection, and a gameplay model built and trained from scratch with RLHF to make strategic decisions within the game environment.
Technical Stack
2. Dataset and Preprocessing
Dataset preprocessing workflow
To develop the computer vision annotation model, I manually annotated over 150 gameplay images, labeling key objects such as Pokémon, items, NPCs, and locations. Data augmentations, including random rotation, color jittering, and contrast adjustments, along with resizing transformations, were applied to enhance the model's robustness and generalization.
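As a rough illustration of these augmentations, a torchvision-style pipeline might look like the sketch below; the specific rotation, jitter, and resize values are placeholders rather than the project's actual settings (and for detection data, bounding boxes would need to be transformed consistently as well).

import torchvision.transforms as T

# Illustrative augmentation pipeline (placeholder parameters, not the project's exact values)
augment = T.Compose([
    T.Resize((640, 640)),                                          # resize gameplay frames to a fixed input size
    T.RandomRotation(degrees=10),                                  # random rotation
    T.ColorJitter(brightness=0.2, contrast=0.3, saturation=0.2),   # color jitter and contrast adjustment
    T.ToTensor(),                                                  # convert a PIL image to a CHW float tensor
])

# augmented = augment(frame)  # where `frame` is a PIL image of a gameplay screenshot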
Once the annotation model was trained, it was used to automatically annotate an additional 450 gameplay states. Each state was paired with a set of possible actions in JSON format (automated via the action_prep.py script), drawn from nine options: A, B, X, Y, Up, Down, Left, Right, and None.
These annotated state-action pairs were then used to pretrain the gameplay model, preparing it for the main reinforcement learning phase.
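For context, one such state-action pair might be stored as the hypothetical JSON below; the field names are illustrative, since the exact schema produced by the action_prep.py script isn't shown here.

import json

# Hypothetical schema for one annotated state-action pair; the real
# fields produced by action_prep.py may differ.
state_action_pair = {
    "state_image": "frames/state_0001.png",
    "annotations": [
        {"label": "NPC", "bbox": [120, 64, 180, 128]},
        {"label": "item", "bbox": [32, 200, 64, 232]},
    ],
    "possible_actions": ["A", "B", "X", "Y", "Up", "Down", "Left", "Right", "None"],
    "chosen_action": "Up",
}

with open("state_0001.json", "w") as f:
    json.dump(state_action_pair, f, indent=2)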
3. Project Structure
This project was divided into several key components to develop a robust AI model capable of navigating and interacting within the game environment. Each component played a critical role in building an effective and adaptive gameplay agent:
1. Development and training of a computer vision annotation model to identify key game elements.
2. Construction of a PyTorch-based gameplay model with hyperparameter tuning for optimal performance.
3. Pretraining of the gameplay model on annotated data, followed by reinforcement learning with human feedback to refine its interactions.
4. Evaluation of the model's performance in achieving in-game objectives.
3.1 COMPUTER VISION MODEL
The computer vision model, based on YOLOv9, detects key objects like Pokémon, items, and NPCs to enable situational awareness for the gameplay model; a brief sketch of the fine-tuning and auto-annotation steps follows the list below.
Architecture: YOLOv9, fine-tuned on a custom dataset of 150 annotated gameplay images.
Dataset & Augmentation: Key objects manually labeled, with data augmentation (rotation, color jitter) applied to improve robustness.
Training: Optimized hyperparameters for high detection accuracy. After training, the model auto-annotated 450 additional images.
Outcome: Achieved high accuracy, providing reliable annotations for the gameplay model's pretraining phase.
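A minimal sketch of the fine-tuning and auto-annotation steps described above, assuming the Ultralytics YOLO API and a hypothetical dataset config (pokemon_platinum.yaml); the paths and hyperparameters are illustrative.

from ultralytics import YOLO

# Fine-tune YOLOv9s on the manually annotated gameplay images
# (dataset config path and hyperparameters are placeholders).
model = YOLO("yolov9s.pt")
model.train(data="pokemon_platinum.yaml", epochs=100, imgsz=640, batch=16)

# Use the fine-tuned model to auto-annotate additional gameplay frames
results = model.predict(source="frames/unlabeled/", conf=0.5, save_txt=True)
for r in results:
    print(r.path, r.boxes.cls.tolist())  # detected class ids per frame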
3.2 GAMEPLAY MODEL
The gameplay model combines convolutional feature extraction with an LSTM over sequences of game states:

import torch
import torch.nn as nn
import torch.nn.functional as F

class PokemonModelLSTM(nn.Module):
    def __init__(self, input_size, conv1_dim, conv2_dim, hidden_size, num_layers, num_actions, use_annotations=False):
        super(PokemonModelLSTM, self).__init__()
        # Store hyperparameters as attributes for easy access
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.num_actions = num_actions
        self.use_annotations = use_annotations
        # Convolutional layers to extract features from the game state
        self.conv1 = nn.Conv2d(3, conv1_dim, kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(conv1_dim, conv2_dim, kernel_size=3, stride=2, padding=1)
        # LSTM to capture temporal dependencies; input_size will be adjusted dynamically in the forward pass
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)
        # Fully connected layer to map the LSTM output to the number of actions
        self.fc = nn.Linear(hidden_size, num_actions)

    def forward(self, x, annotations=None):
        # Input x is expected to be of shape (batch_size, seq_len, C, H, W)
        batch_size, seq_len, C, H, W = x.size()
        # Reshape input for the convolutional layers
        x = x.view(batch_size * seq_len, C, H, W)
        # Apply convolutional layers with ReLU activations
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        # Flatten the output for the LSTM layer
        x = x.view(batch_size, seq_len, -1)  # Shape: (batch_size, seq_len, conv2_dim * H' * W')
        # Concatenate annotation features if they are provided
        if self.use_annotations and annotations is not None:
            annotations = annotations.view(batch_size, seq_len, -1)  # Reshape annotations to (batch_size, seq_len, annotation_features)
            x = torch.cat((x, annotations), dim=-1)  # Concatenate along the feature dimension
        # Dynamically reinitialize the LSTM if needed to match the actual input size (kept on the same device as the input)
        lstm_input_size = x.size(-1)
        if lstm_input_size != self.lstm.input_size:
            self.lstm = nn.LSTM(input_size=lstm_input_size, hidden_size=self.hidden_size, num_layers=self.num_layers, batch_first=True).to(x.device)
        # Apply the LSTM
        x, _ = self.lstm(x)  # x: (batch_size, seq_len, hidden_size)
        # Fully connected layer: map the last time step's LSTM output to the action space
        x = self.fc(x[:, -1, :])
        return x
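For illustration, the gameplay model might be instantiated and run on a dummy batch as follows; the hyperparameter values and frame size are placeholders rather than the tuned values used in the project.

import torch

# Placeholder hyperparameters for illustration only
model = PokemonModelLSTM(
    input_size=16 * 64 * 64,  # conv2_dim * H' * W' for 256x256 frames downsampled twice
    conv1_dim=8,
    conv2_dim=16,
    hidden_size=128,
    num_layers=2,
    num_actions=9,            # A, B, X, Y, Up, Down, Left, Right, None
)

# Dummy batch: 4 sequences of 5 RGB frames at 256x256
frames = torch.randn(4, 5, 3, 256, 256)
logits = model(frames)        # shape: (4, 9), one score per possible action
action = logits.argmax(dim=-1)
print(action)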
3.3 Model Pretraining and Reinforcement Learning Training
Prior to reinforcement learning, the model was trained on foundational knowledge through gameplay state-action pairs in order to optimize the learning process.
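This pretraining step amounts to supervised learning on the annotated state-action pairs; a minimal sketch, assuming a hypothetical DataLoader yielding frame sequences and action-index labels:

import torch
import torch.nn as nn

# Hypothetical pretraining loop over annotated (frame sequence, action index) pairs
def pretrain(model, dataloader, epochs=10, lr=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        total_loss = 0.0
        for frames, actions in dataloader:    # frames: (B, seq_len, C, H, W), actions: (B,)
            logits = model(frames)            # (B, num_actions)
            loss = criterion(logits, actions)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"epoch {epoch}: loss {total_loss / len(dataloader):.4f}")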
Reinforcement Learning Training Loop
*Note that this is for exploitation, not exploration.
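The original training loop isn't reproduced here, but a minimal sketch of an exploitation-style loop with a human-feedback reward term might look like the following; the environment interface, reward values, and helper names are hypothetical.

import torch
import torch.nn.functional as F

# Hypothetical environment / feedback interface; the real project's helpers may differ.
def train_step(model, optimizer, env, seq_len=5, gamma=0.99):
    frames, log_probs, rewards = [], [], []
    state = env.reset()                            # initial gameplay frame, shape (3, H, W)
    for _ in range(seq_len):
        frames.append(state)
        seq = torch.stack(frames).unsqueeze(0)     # frames seen so far: (1, t, C, H, W)
        logits = model(seq)
        probs = F.softmax(logits, dim=-1)
        action = probs.argmax(dim=-1)              # exploitation: always pick the highest-probability action
        log_probs.append(torch.log(probs[0, action]))
        state, env_reward, done = env.step(action.item())
        human_reward = env.get_human_feedback()    # e.g. +1 / -1 entered by the human observer
        rewards.append(env_reward + human_reward)
        if done:
            break
    # Discounted return for each step, then a simple policy-gradient style update
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + gamma * G
        returns.insert(0, G)
    loss = -torch.stack([lp * R for lp, R in zip(log_probs, returns)]).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return sum(rewards)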
3.4 EVALUATION
The evaluation phase focused on assessing the model's performance in object detection and gameplay decision-making accuracy; a small sketch of how the gameplay metrics could be computed follows the list below.
Computer Vision Model Metrics:
Mean Average Precision (mAP): Used to measure detection accuracy for Pokémon, items, and NPCs.
Precision and Recall: Evaluated to ensure a balance between correctly identifying objects and avoiding false positives.
Gameplay Model Metrics:
Accuracy of Actions: Percentage of correct actions taken by the model during gameplay.
Reward Score: Cumulative reward achieved per episode to assess decision-making quality in achieving game objectives.
Consistency: Frequency of successful task completions (e.g., navigating to specific locations).
Human Feedback Integration: Analyzed the model’s improvement in decision-making accuracy based on human feedback adjustments.
Real-Time Performance: Ensured that the model could perform efficiently within the game environment with minimal lag or delays.
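As a rough sketch, assuming a hypothetical per-episode log of chosen actions, reference actions, and rewards, the gameplay metrics above could be computed like this:

# Hypothetical episode log: list of (chosen_action, reference_action, reward) tuples
def gameplay_metrics(episode_log):
    chosen = [c for c, _, _ in episode_log]
    reference = [r for _, r, _ in episode_log]
    rewards = [rw for _, _, rw in episode_log]
    action_accuracy = sum(c == r for c, r in zip(chosen, reference)) / len(episode_log)
    cumulative_reward = sum(rewards)
    return {"action_accuracy": action_accuracy, "cumulative_reward": cumulative_reward}

# Example usage with a toy three-step episode
log = [("Up", "Up", 1.0), ("A", "B", -0.5), ("Right", "Right", 2.0)]
print(gameplay_metrics(log))  # {'action_accuracy': 0.666..., 'cumulative_reward': 2.5}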
4. Challenges & Solutions
Limited Dataset for Training Computer Vision Model
Manually annotated a small set of images (150) and applied data augmentation techniques (rotation, color jitter, contrast adjustments).
Achieved 70%+ accuracy on key object detection (Pokémon, NPCs), verified on test data with diverse augmentations.
Managing Memory for Contextual Gameplay
Implemented an extended LSTM module to retain game context (map location, Pokémon health, etc.).
Improved task performance by 25% in complex scenarios requiring memory, such as revisiting NPCs or avoiding areas with low health.
Difficulty in Model Tuning for Real-Time Gameplay
Applied hyperparameter tuning and used human feedback to optimize the LSTM model's parameters.
Reached 90% success in task completion (e.g., navigating to locations) across test runs.
Handling Diverse Game Scenarios
Used reinforcement learning with human feedback to train the model across varied scenarios.
Enhanced model’s flexibility, achieving a 15% reduction in failed actions during unpredictable events or rare situations.
Action Space Complexity in Reinforcement Learning
Simplified the action space to directional and core commands (A, B, X, Y) for efficient learning.
Reduced training time by 30%, with a 20% improvement in episode convergence rates.
5. Results
The results showcase the model’s effectiveness across object detection and gameplay decision-making, with improvements observed through human feedback integration and real-time performance.
Computer Vision Model:
Mean Average Precision (mAP): Achieved 70.5% for detecting Pokémon, items, and NPCs accurately.
Precision and Recall: The model achieved a balanced precision and recall of approximately 80%, effectively reducing false positives while accurately identifying in-game objects.
Gameplay Model:
Reward Score: Averaged a cumulative reward score of 160 per episode, demonstrating effective decision-making aligned with game objectives.
Task Consistency: Successfully completed navigation tasks 75% of the time, reliably reaching designated locations in the game.
Human Feedback Integration: Increased action accuracy by 12% post-feedback, enhancing decision quality and responsiveness to in-game challenges.
Real-Time Performance: Achieved smooth operation with <1-second response time, maintaining real-time gameplay without perceptible lag.
6. Conclusion & Future Improvements
Conclusion
As a personal project, this work demonstrated the feasibility of combining computer vision and reinforcement learning to create an AI model that can autonomously navigate and interact within a complex open-world game.
With the incorporation of human feedback, the model's performance was refined, making it adaptable to real-time challenges.
Future Improvements
Fine-Tune Reward Structure: Enhance the reward system in the reinforcement learning model to incentivize more complex in-game strategies and behaviors.
Explore Multi-Agent Scenarios: Develop the model to handle multi-agent environments or interactions with multiple NPCs for a richer gaming experience.
Enhance Model Memory: Implement a memory module (e.g., extended LSTM or transformer-based) to track map, location, Pokémon status, and recent actions, enabling strategy based on game history.
Knowledge Graph Movie Recommender GNN
Movie data knowledge graph in Neo4j! (Blue: Year, Orange: Genre, Purple: Title, Red: Popularity)