Hi, my name is

Yohan Lee

for Your Own Humanistic AI

Creating AI with a belief in the infinite possibilities of problem solving.

About Me

I am an AI researcher at Coxwave, focusing on advancing large language models (LLMs). Previously, I studied German linguistics and cognitive science at Yonsei University. My notable research achievements include 1st place on the Huggingface Open LLM Leaderboard. My research interests center on a comprehensive exploration of instruction tuning from both the data side (quality, quantity, diversity) and the model side (scale, efficiency, objectives). As a passionate professional, I am committed to continually contributing to the future of AI and NLP. Here are a few technologies I've been working with recently:
  • Python
  • PyTorch
  • Huggingface
  • DeepSpeed
  • FlashAttention
  • Parallelism (DP/TP/ZeRO)
  • Quantization
  • PEFT
  • Offloading

Education

Bachelor of Arts in German Language and Literature, Cognitive Science - Yonsei University
Mar 2015 - Feb 2022

I received an A+ (4.3/4.3) in the following courses:

  • Software Programming
  • Understanding and Utilization of Artificial Intelligence
  • Digital Language Data and Humanities
  • Mathematics and Programming
  • Machine Learning and its Applications
  • … and more

Extracurricular Activities

  • President of the piano club at Yonsei University.
    • Organized and led two large-scale concerts, showcasing musical talents and collaborative efforts.
    • Initiated and conducted a special concert at Severance Hospital, offering performances dedicated to patients, demonstrating community involvement and leadership skills.

Experiences

Mar 2024 - Jun 2024
AI Research Engineer (NLP Specialist)
WRTN Technologies

Conducting Research on LLM Agent Evaluation Benchmark

  • Designed and implemented a benchmark system for in-the-wild human-assistant dialogues
  • Developed a comprehensive evaluation framework for assessing human-assistant interactions in real-world scenarios

Jul 2023 - Feb 2024
Research Scientist (NLP)
Riiid

Conducting Research on Large Language Models for Education

  • Competed on the Huggingface Open LLM Leaderboard, achieving 1st place in Oct 2023
  • Explored the effects of instruction tuning from data (quantity, quality, diversity) and model (scale, efficiency, objective) perspectives
  • Implemented diverse optimization techniques (ZeRO, FSDP, and FlashAttention) for training/inference on a single 8xA100 machine

Automated Essay Scoring

  • Achieved state‐of‐the‐art results on public essay scoring benchmarks
  • Conducted “Bar exam” scoring that performed better than GPT‐4

Dec 2021 - Feb 2023
NLP Engineer
TUNiB

Korean Open‐domain Chatbot Service

  • Directed dialogue data collection and quality filtering using advanced LLMs
  • Developed an in‐house Korean LM for multi‐persona chatbot with self‐collected datasets
  • Operated a KakaoTalk‐based chatbot service

AI Grand Challenge: Policy Support AI

  • Awarded the Ministry of Science and ICT Minister’s Award
  • Orchestrated TableQA data collection with policy domain experts
  • Developed a continual learning framework with OCR‐based parsing and additional table data
  • Developed an integrated QA system for processing texts, tables, and charts

Projects

Story Completion Competition
python pytorch
- Instruction fine-tuned a Korean 13B LM with diverse data augmentation, achieving 2nd place on the public and private leaderboards in 3 days
Achieved SOTA in KLUE Benchmark
python pytorch huggingface
- Implemented R‐BERT and Retrospective Reader models, and enhanced their structures and learning methods
- Achieved SOTA in TC, STS, RE, MRC, and NLI tasks
Open Domain Question Answering Competition (Gold Medal)
python pytorch huggingface
- Improved top-1 document retrieval accuracy from 32% to 78% using hybrid retrieval techniques
- Improved the question answering EM score from 62.7 to 79.9
Relation Extraction Competition (Silver Medal)
python pytorch huggingface
- Trained custom embeddings, layers, and loss function with diverse augmented dataset
Image Classification Competition (Silver Medal)
python pytorch huggingface
- Utilized multi-task learning and test-time augmentation techniques
Korean College Scholastic Ability Test (CSAT)
python pytorch huggingface
- Compared the performance of various LLMs in solving the 2024 CSAT
Daily Papers
python pytorch huggingface
- Developed a tool for auto-translating and summarizing Huggingface's daily papers into Korean using ChatGPT
PFP Story Generation
python pytorch huggingface
- Completed stories for 5,000 PFP characters using GPT-3
Look, Attend, and Generate Poem
python pytorch huggingface
- Developed a web service for generating poetry from user-uploaded photos
Movie Review Rating Service
python pytorch huggingface
- Developed a web service for auto-rating and archiving key movie reviews

Achievements

SKKU, 3rd Annual University Student AI x Bookathon
Winner of the essay generation task.
Naver Connect, Open Domain Question Answering Competition
Gold Medal (Nov 2021)
Naver Connect, Relation Extraction Competition
Silver Medal (Oct 2021)
Naver Connect, Mask Image Classification Competition
Silver Medal (Sep 2021)
Yonsei Univ, German Language and Literature Department
Scholarship (Aug 2019)

Get in Touch

My inbox is always open. Whether you have a question or just want to say hi, I’ll try my best to get back to you!