Dacheng Shen | Academic Homepage

🎓 Education

Washington State University, Tri-Cities Jan 2026 - Present

Ph.D. in Computer Science

University of Southern California June 2024 - Dec 2025

M.S. in Computer Science

University of Connecticut Sept 2020 - May 2024

B.S. in Computer Science

Dean's List Honors

📄 Publications, Preprints & Presentations

The Trafficker's Pitch: Detecting Deceptive Recruitment in Online Job Boards 2026

CySoc Workshop at AAAI ICWSM, 2026

Siyi Zhou, Peiran Qiu, Tanishq Salkar, Leonardo Blas Urrutia, Dacheng Shen, Deyang Hsu, Eun Cheol Choi, and Emilio Ferrara.

A Simplex-Inspired Architecture for Integrating Quantum Capabilities into Cyber-Physical Systems Mar. 2026

Poster Presentation — 2nd HAIQ Workshop, Pittsburgh, PA

Tamim Ahmed, Dacheng Shen, Mengyu Liu, and Monowar Hasan.

🔬 Research

LLM-Guided Power Grid Contingency Mitigation Jan. 2026 - Apr. 2026

Research Assistant / Ph.D. Researcher

Designed an LLM-guided decision framework for post-contingency power grid restoration under N-1 reliability, integrating AC-OPF verification and runtime assurance to generate and refine sparse generator redispatch actions.
Evaluated on IEEE benchmark systems under in-distribution and out-of-distribution contingency scenarios, benchmarking against OPF-based and rule-based baselines.
Improved restoration success rate and reduced control cost under safety constraints, demonstrating LLM-assisted decision support for cyber-physical power systems.

PyHazards: AI-Powered Hazard Prediction Framework Sep. 2025 - Jan. 2026

Research Assistant, FORTIS Lab at USC (PI: Prof. Yue Zhao)

Integrated Transformer-based models into PyHazards, an open-source Python library for multi-hazard prediction.
Developed modular data, training, and evaluation pipelines and benchmarked model performance against public baselines.
Standardized model interfaces and experiment workflows with the RAI Lab team to improve reproducibility and extensibility.

Comparative Narrative Analysis with Claude 3.7 Mar. 2025 - Jun. 2025

Research Assistant, FORTIS Lab at USC (PI: Prof. Yue Zhao)

Designed and automated a Claude 3.7 Sonnet-based narrative comparison pipeline for 13 news-story pairs, generating 208 structured outputs across conflict, unique, holistic, and overlapping dimensions with four prompt levels.
Implemented scalable inference through the Anthropic API with automated output parsing, organizing responses into structured JSON and human-readable text for systematic downstream analysis and annotation.
Developed multi-level prompt rubrics incorporating relevance, factual consistency, coherence, fluency, and bias criteria to support more reliable narrative comparison and subsequent human evaluation.
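The output count above is consistent with one structured output per story pair, dimension, and prompt level (13 × 4 × 4 = 208). A minimal sketch that enumerates such a job grid, assuming that layout; the numeric prompt-level labels and the dictionary keys are hypothetical, and no API call is made:

```python
from itertools import product

PAIR_IDS = range(1, 14)  # 13 news-story pairs
DIMENSIONS = ["conflict", "unique", "holistic", "overlapping"]
PROMPT_LEVELS = [1, 2, 3, 4]  # hypothetical labels for the four prompt levels

# One job per (pair, dimension, prompt level) combination.
jobs = [
    {"pair": pair, "dimension": dimension, "prompt_level": level}
    for pair, dimension, level in product(PAIR_IDS, DIMENSIONS, PROMPT_LEVELS)
]

print(len(jobs))  # 208
```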

OEM Controls Automated Angle Test for AS5 Aug. 2023 - May 2024

Capstone Project Lead

Led the design of a Python-based automation system on Ubuntu for calibrating and testing OEM Controls' AS5 angle sensor with a UFACTORY xArm 6.
Integrated a Tkinter GUI, robotic-arm API, CANopen sensor communication, real-time progress monitoring, and CSV result logging to automate testing from −80° to 80° in 5° increments.
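The sweep described above can be sketched as follows. This is an illustration only, not the project's code: `read_sensor_angle` is a hypothetical stand-in for the CANopen sensor read, the arm motion is omitted, and the CSV column names are assumptions.

```python
import csv
import io

def sweep_setpoints(start_deg=-80, stop_deg=80, step_deg=5):
    """Return the commanded angles, inclusive of both endpoints."""
    return list(range(start_deg, stop_deg + 1, step_deg))

def read_sensor_angle(commanded_deg):
    """Hypothetical stand-in for a sensor read; simply echoes the command."""
    return float(commanded_deg)

def run_sweep(out):
    """Log one CSV row per setpoint: commanded, measured, error."""
    writer = csv.writer(out)
    writer.writerow(["commanded_deg", "measured_deg", "error_deg"])
    for angle in sweep_setpoints():
        measured = read_sensor_angle(angle)
        writer.writerow([angle, measured, measured - angle])

buffer = io.StringIO()  # in-memory file; a real run would open a CSV on disk
run_sweep(buffer)
rows = buffer.getvalue().splitlines()
print(len(rows) - 1, "setpoints logged")  # 33 setpoints logged
```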

📚 Projects

Bullet-Hell Game — Software Design & Architecture Jan. 2026 - Apr. 2026

Developed a 2D bullet-hell shooter in Java, featuring multi-wave enemy spawning, a two-stage final boss, and Easy/Normal/Hard difficulty scaling.
Applied six OOP design patterns across 90+ classes, including Strategy (IMovementStrategy, IBulletPattern with 8 behaviors), Factory and Builder (EnemyFactory, EnemyBuilder), State, and Command, within a layered codebase split into a reusable engine library and a game project.
Implemented a data-driven wave sequencer, circle-based collision system, real-time HUD, and sprite animation manager, with service interfaces (IAssetProvider, ICollisionSystem, IEntityManager) for dependency injection.

Transfer Learning for Imbalanced Waste Classification Jan. 2025 - May 2025

Conducted a controlled ablation of three VGG16-based training configurations on the 9-class, 4,752-image RealWaste dataset, using stratified train/validation/test splits and a custom classification head.
Compared frozen-backbone and block5 fine-tuning strategies with inverse-frequency class weights and RandomOverSampler; fine-tuning with cost-sensitive weights improved test accuracy from 60.6% to 69.9% and Macro F1 by 10.3 pp.
Performed confusion-matrix-based error analysis revealing persistent inter-class confusions, including Plastic/Metal and Paper/Cardboard, and proposed a class-selective hybrid imbalance strategy for future work.
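As an illustration of inverse-frequency class weighting, the sketch below uses the normalization n_samples / (n_classes × class_count), which is the formula behind scikit-learn's `class_weight="balanced"`. The project description does not state which normalization was used, so treat this as one common choice; the labels are toy data, not the RealWaste class distribution.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy labels: 6 plastic, 3 metal, 1 glass.
toy_labels = ["plastic"] * 6 + ["metal"] * 3 + ["glass"]
weights = inverse_frequency_weights(toy_labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

Rarer classes receive larger weights, so their misclassifications contribute more to the loss during training.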

AI-Driven Travel Assistant Design – FlySmart Mar. 2025 - May 2025

Led a requirements engineering project to design an AI-powered flight booking assistant.
Conducted stakeholder analysis, surveys, and interviews; derived user personas and empathy maps; and categorized requirements.
Created and validated a Figma prototype with features including flight search, visa management, smart alerts, and a personalized AI chat interface.
Applied agile methodology with a story-driven backlog, a sprint-based development roadmap, and formal validation via usability testing and prototyping.

Semantic Retrieval and QA System with Weaviate & RAG Sep. 2024 - Dec. 2024

Built a semantic search pipeline using Weaviate with text2vec-transformers for vector-based brand similarity via GraphQL nearText queries.
Developed a PDF QA chatbot with Streamlit using PyPDFLoader, ChromaDB, and Hugging Face embeddings in a RAG framework, and benchmarked Weaviate vs. ChromaDB on retrieval accuracy and scalability.
Automated ingestion and querying with Python and Bash, orchestrating deployment with Docker Compose.

Large-Scale Sentiment Analysis on Amazon Office Reviews Aug. 2025 - Oct. 2025

Cleaned and normalized 200K balanced Amazon review texts through HTML/URL removal, tokenization, stopword filtering, and lemmatization; trained unigram TF-IDF models with LinearSVC, Logistic Regression, and Perceptron, achieving 89.6% test accuracy.
Compared two GloVe representations using a feed-forward neural network—100-d average pooling and 1000-d concatenation of the first 10 tokens—showing that average pooling improved test accuracy from 77.7% to 83.3% and reduced overfitting.
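The two feature constructions compared above can be sketched as follows. This is a hedged illustration: `toy_embedding` is a hypothetical stand-in for a 100-d GloVe lookup, and zero-padding reviews shorter than 10 tokens is an assumption, since the description does not say how short reviews were handled.

```python
import random

EMBED_DIM = 100   # dimensionality of each word vector
MAX_TOKENS = 10   # tokens kept by the concatenation variant

def toy_embedding(token):
    """Deterministic pseudo-random vector standing in for a GloVe lookup."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]

def average_pool(tokens):
    """Mean of all token vectors: one 100-d feature vector."""
    vectors = [toy_embedding(t) for t in tokens]
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def concat_first_tokens(tokens):
    """First 10 token vectors joined end to end, zero-padded to 1000-d."""
    features = []
    for token in tokens[:MAX_TOKENS]:
        features.extend(toy_embedding(token))
    features.extend([0.0] * (EMBED_DIM * MAX_TOKENS - len(features)))
    return features

review = "sturdy stapler works well".split()
pooled = average_pool(review)
concatenated = concat_first_tokens(review)
print(len(pooled), len(concatenated))  # 100 1000
```

Average pooling uses every token and keeps the input small, while concatenation preserves word order for the first 10 tokens at the cost of a tenfold larger input layer.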

Student Admin Design Sep. 2023 - Dec. 2024

Collaborated on a group project, using Figma to design the product model and present the plan to stakeholders.
Implemented each feature from the Figma design in code.
Completed website development and tested all functionality.

Food Ordering App Design Aug. 2021 - Nov. 2021

Built a food ordering app in Java that generates an invoice listing each dish's name and price and the order total.
Led the group and used GitHub to share and merge each member's contributions.
Completed a report and delivered a presentation.

💼 Experience

Software Development Engineer Jul 2023 - Aug 2023

HICOCA Intelligent Equipment Technology

Maintained web server backends and optimized database operations, ensuring data integrity, security, and system stability.
Developed and integrated new client-side UI features into the existing framework, improving system functionality and user experience.

🔧 Skills

💻

Programming Languages

Python, C#, Java, C/C++, SQL, Bash/Shell, HTML

🤖

AI/ML & NLP

PyTorch, TensorFlow/Keras, scikit-learn, Hugging Face Transformers, Transfer Learning, RAG, Semantic Search, Prompt Engineering, NLTK, Gensim

🔬

Domain & Research

Cyber-Physical Systems, Reinforcement Learning, Time-Series Forecasting

🗄️

Data & Databases

NumPy, Pandas, MySQL, MongoDB, ChromaDB, Weaviate, GraphQL, Data Preprocessing

📐

Software Engineering

REST APIs, OOP Design Patterns, Modular Architecture, Agile (Scrum), Git

🛠️

Tools & Platforms

Linux/Ubuntu, Docker, AWS, Streamlit, Jupyter, Figma