Portfolio

Hi, my name is

Yohan Lee

for Your Own Humanistic AI

Creating AI with a belief in the infinite possibilities of problem solving.

About Me

I am an AI Research Engineer at KakaoBank, focusing on developing finance-specific large language models (LLMs) and autonomous agents. Previously, I worked at Coxwave, WRTN Technologies, Riiid, and TUNiB, where I contributed to advancing LLMs and NLP technologies across various domains. My notable achievements include 1st place on the Huggingface Open LLM Leaderboard and the Ministry of Science and ICT Minister’s Award. My research interests cover instruction tuning, model efficiency, and domain adaptation, with a particular emphasis on real-world applications in finance and education. I am passionate about pushing the boundaries of AI and committed to shaping the future of language technologies. Here are a few technologies I've been working with recently:

Python
PyTorch
Huggingface
DeepSpeed
Flash Attention
Parallelism (DP/TP/ZeRO)
Quantization
PEFT
Offloading

Education

Yonsei University

Bachelor of Arts in German Language and Literature, Cognitive Science - Yonsei University

Mar 2015 - Feb 2022

I received an A+ (4.3/4.3) in the following courses:

Software Programming
Understanding and Utilization of Artificial Intelligence
Digital Language Data and Humanities
Mathematics and Programming
Machine Learning and its Applications
… and more

Extracurricular Activities

President of the Piano Club at Yonsei University.
- Organized and led two large-scale concerts, showcasing musical talents and collaborative efforts.
- Initiated and conducted a special concert at Severance Hospital, offering performances dedicated to patients, demonstrating community involvement and leadership skills.

Experiences

May 2024 - Present

AI Research Engineer

KakaoBank

Development of Finance-Specific LLMs and Autonomous Agents

Developed in-house LLMs tailored to banking applications
Developed finance-specific LLMs and autonomous agents

Jul 2024 - May 2025

AI Researcher

Coxwave

Development of Domain-Specific LLMs for Quantum Physics

Built an AI Tutor and Assistant for quantum physics using continual pre-training and fine-tuning (SFT, DPO)
Built a unified model (Llama 3.1, 8B) with 81.1% accuracy on MCQ test sets, outperforming GPT-4o (63.7%)
Secured a $140k contract and delivered AI solutions under the AI Voucher program

Research on Many-Shot Jailbreaking

Developed a comprehensive attack framework for many-shot scenarios
Analyzed long-context vulnerabilities in open-source LLMs

Mar 2024 - Jun 2024

AI Research Engineer (NLP Specialist)

WRTN Technologies

Research on LLM Agent Evaluation Benchmarks

Designed and implemented a benchmark system for in-the-wild human-assistant dialogues
Developed a comprehensive evaluation framework for assessing the performance of human-assistant interactions in real-world scenarios

Jul 2023 - Feb 2024

Research Scientist (NLP)

Riiid

Research on Large Language Models for Education

Competed on the Huggingface Open LLM Leaderboard, achieving 1st place in October 2023
Explored the effects of instruction tuning from data (quantity, quality, diversity) and model (scale, efficiency, objective) perspectives
Implemented diverse optimization techniques (ZeRO, FSDP, and FlashAttention) for training and inference on a single 8xA100 machine

Automated Essay Scoring

Achieved state‐of‐the‐art results on public essay scoring benchmarks
Conducted “Bar exam” scoring that outperformed GPT‐4

Dec 2021 - Feb 2023

NLP Engineer

TUNiB

Korean Open‐domain Chatbot Service

Directed dialogue data collection and quality filtering using advanced LLMs
Developed an in‐house Korean LM for multi‐persona chatbot with self‐collected datasets
Operated a KakaoTalk‐based chatbot service

AI Grand Challenge: Policy Support AI

Awarded the Ministry of Science and ICT Minister’s Award
Orchestrated TableQA data collection with policy domain experts
Developed a continual learning framework with OCR‐based parsing and additional table data
Developed an integrated QA system for processing texts, tables, and charts

Projects

python pytorch

Story Completion Competition

Instruction fine-tuned a Korean 13B LM with diverse data augmentation, achieving 2nd place on the public and private leaderboards within 3 days

python pytorch huggingface

Achieved SOTA on the KLUE Benchmark

- Implemented R‐BERT and Retrospective Reader models, and enhanced their structures and learning methods
- Achieved SOTA in TC, STS, RE, MRC, NLI tasks

python pytorch huggingface

Open Domain Question Answering Competition (Gold Medal)

- Improved top-1 accuracy of document retrieval from 32% to 78% with hybrid retrieval techniques
- Improved EM score of question answering from 62.7 to 79.9 by effective methods
