Skills & Stack
Technical Foundations
- Programming & Backend Development
  - Python: NumPy, Pandas, scikit-learn, SciPy, spaCy, NLTK
  - Web Frameworks: Flask, Django, FastAPI
  - JavaScript: Node.js, Express.js
  - REST API Design & Integration
- Artificial Intelligence & Machine Learning
  - Model Development Concepts
  - Data Preprocessing and Feature Engineering
  - Data Visualization: Matplotlib, Seaborn, Plotly
- AI Systems & Retrieval Engineering
  - Retrieval-Augmented Generation (RAG) Architecture
  - Embedding Pipelines & Vector Storage
  - Rule-Based and LLM Systems
  - Model Performance Optimization (CPU-bound environments)
  - LLM Deployment via API (Express / FastAPI)
- Infrastructure & Deployment
  - Linux VPS Hosting & Server Configuration
  - NGINX Hosting + Reverse Proxy Setup
  - PM2 Process Management
  - Resource-Constrained AI Optimization
Applied Systems & Experiments
onebadev.com - VPS Infrastructure & Deployment
The portfolio site itself is not hosted on a managed platform; it runs on a
self-configured Ubuntu VPS, handled end-to-end from initial setup through
production deployment. This includes provisioning the server environment, configuring
NGINX as a reverse proxy to route traffic to a Node.js backend (Python in development), and managing the
application processes with PM2 to ensure uptime and automatic restarts on failure.
SSL/TLS termination is handled at the NGINX layer via Let's Encrypt, keeping the
backend service unexposed to the public internet. Static assets are served directly
through NGINX, while API traffic is proxied to the appropriate internal port.
The server also runs UFW-based firewall rules with only the necessary ports exposed,
and auth.log is monitored as an ongoing exercise in understanding what a live,
public-facing server actually looks like under real traffic conditions.
- OS & Hosting: Ubuntu VPS, self-managed
- Web Server: NGINX - reverse proxy, static asset serving, SSL termination
- Process Management: PM2 - process persistence, auto-restart, log management
- Security: UFW firewall, Let's Encrypt (Certbot)
- Backend: Node.js + Express proxied via NGINX (Python / FastAPI in development)
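The routing described above can be sketched as a minimal NGINX server block. This is an illustrative sketch, not the production config: the document root, certificate paths, and backend port (3000) are assumptions.

```nginx
# Sketch of the reverse-proxy setup: static assets served directly,
# API traffic proxied to the internal Node.js port, TLS terminated here.
# Paths and the port number are illustrative assumptions.
server {
    listen 443 ssl;
    server_name onebadev.com;

    # Certificates issued by Let's Encrypt (Certbot)
    ssl_certificate     /etc/letsencrypt/live/onebadev.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/onebadev.com/privkey.pem;

    # Static assets served directly by NGINX
    location / {
        root /var/www/onebadev;
        try_files $uri $uri/ /index.html;
    }

    # API traffic proxied to the backend, which never faces the internet directly
    location /api/ {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

With this shape, UFW only needs to expose 80/443 (plus SSH); the Node process listens on localhost only.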
BadBot: Production RAG Chatbot (onebadev.com)
Originally developed as a rule-based Python chatbot for a Natural Language
Processing course, this assistant began as a structured Q&A system built
to recognize predefined prompts and return custom responses. While the early
version relied on pattern matching and scripted logic, the broader goal became
a bit more ambitious: evolve the chatbot into a system capable of intelligently
answering questions about my background, projects, and technical experience.
To support that idea, the project was rewritten in JavaScript to align with
my website's backend architecture. This transition established the foundation
for integrating large language models and retrieval-based techniques, with
the long-term objective of building a Retrieval-Augmented Generation (RAG)
system. The intent is for the chatbot to move beyond fixed responses and instead
dynamically retrieve relevant information about my work, education, and portfolio,
enabling more natural, accurate, and context-aware interactions.
At its core, the project represents a shift from static question-answer logic
toward an intelligent, searchable knowledge assistant built around my personal
technical ecosystem.
- Backend: Node.js + Express.js, CORS
- LLM: Ollama (Gemma model)
- Ingestion: DirectoryLoader (PDFLoader + TextLoader)
- Chunking / Retrieval: RecursiveCharacterTextSplitter + Top-k retrieval
- Embeddings: OllamaEmbeddings
- Vector Store: FAISS vector store, with on-disk persistence
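The retrieval step of the stack above can be shown in miniature. In the real pipeline the embeddings come from OllamaEmbeddings and the similarity search is delegated to FAISS; this sketch just illustrates the same idea, scoring chunks by cosine similarity against the query vector and keeping the top k (the 2-dimensional "embeddings" are toy values):

```javascript
// Score every stored chunk against the query embedding and keep the top k.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(queryVec, chunks, k) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVec, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy corpus with made-up 2-dimensional embeddings
const chunks = [
  { text: "resume",    vector: [1, 0] },
  { text: "projects",  vector: [0.9, 0.1] },
  { text: "education", vector: [0, 1] },
];
const results = topK([1, 0], chunks, 2);
console.log(results.map((r) => r.text)); // → [ 'resume', 'projects' ]
```

The retrieved chunks are then stuffed into the LLM prompt as context, which is the "augmented" part of RAG.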
Journal | Notes:
- Jan 2026: The site itself has been up and running (and will continue to be), but after deploying BadBot to my VPS I have encountered a performance bottleneck. While responses were fast on my personal PC, the minimal, CPU-only server (no GPU) has been struggling under the load. The model, Gemma3:1b, is too heavy for the available hardware, maxing out the CPU and causing response times to rise. To hopefully solve this, I am currently downsizing the LLM to a leaner Gemma3:270M and reimplementing a lightweight, rule-based system to handle common questions more quickly. In this hybrid setup, the AI model is only invoked for more complex queries, helping balance model capability against hardware constraints and keeping the site fast and responsive. Barring any issues with the updated model (or anything else), upcoming improvements include lightweight in-session memory to further improve conversational flow while maintaining hardware efficiency.
- March 2026: It has been a while, but after continuing to work through the limitations of CPU-bound performance on the VPS, the approach has recently shifted toward cloud-based models. The current model is still Gemma3 but has been upgraded to 27b-cloud, easing the hardware bottleneck that had been limiting response quality and speed. The local model experiments showed that lighter models had a tendency to hallucinate more frequently, while heavier ones pushed the vCPU to its limits. Additional improvements include a server-side response cache to reduce redundant calls for repeated questions, and refined document chunking with natural text-boundary separators for cleaner retrieval.
- April 12, 2026: Back to the model drama: I had to step back down in model size and am currently running the 1b. This one can take some time and hallucinates somewhat, but the idea is still there. Looking into the issue and will update.
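The hybrid fast path described in the Jan 2026 entry could be sketched roughly as below. The patterns, replies, and `callLLM` hook are illustrative placeholders, not BadBot's actual rules:

```javascript
// Hypothetical sketch of the hybrid setup: cheap pattern rules answer
// common questions instantly; anything unmatched falls through to the LLM.
const rules = [
  { pattern: /\b(hi|hello|hey)\b/i,       reply: "Hey! Ask me about my projects or background." },
  { pattern: /\b(stack|technologies)\b/i, reply: "Node.js, Express, Python, and a RAG pipeline over FAISS." },
];

async function answer(question, callLLM) {
  for (const rule of rules) {
    if (rule.pattern.test(question)) {
      return { source: "rules", reply: rule.reply }; // instant, zero CPU cost
    }
  }
  // Only complex queries reach the (CPU-expensive) model.
  return { source: "llm", reply: await callLLM(question) };
}
```

The payoff on a CPU-only box is that the common greetings and FAQ-style questions never spin up an inference call at all.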
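The server-side response cache from the March 2026 entry might look something like this sketch: keyed on the normalized question so trivially different phrasings of the same repeat question hit the cache. The TTL and size limit are illustrative numbers, not the production values:

```javascript
// Sketch of a TTL-bounded response cache so repeated questions skip the model.
class ResponseCache {
  constructor(ttlMs = 10 * 60 * 1000, maxEntries = 500) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.map = new Map(); // Map preserves insertion order, used for eviction
  }

  key(question) {
    // Normalize so casing/whitespace variants share one entry
    return question.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(question) {
    const k = this.key(question);
    const entry = this.map.get(k);
    if (!entry) return null;
    if (Date.now() - entry.at > this.ttlMs) { // expired entry
      this.map.delete(k);
      return null;
    }
    return entry.reply;
  }

  set(question, reply) {
    if (this.map.size >= this.maxEntries) {
      // Evict the oldest entry (first key in insertion order)
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(this.key(question), { reply, at: Date.now() });
  }
}

const cache = new ResponseCache();
cache.set("What is BadBot?", "A RAG chatbot about this site.");
console.log(cache.get("  what is badbot?  ")); // normalized lookup hits
```

On a vCPU-bound box this trades a tiny amount of memory for skipping an entire inference pass on repeats.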