About Me

- Email: yohan9612@yonsei.ac.kr
- Phone: +821063236913
- Github: l-yohai
- Linkedin: l-yohai


Research interests

• Reliability of Large Language Models
• Data-centric AI

Education

• Bachelor’s Degree in German Language & Literature, Cognitive Science
Yonsei University (2015 ~ 2022)

Work Experience

WRTN Technologies / AI Research Engineer (NLP Specialist) (2024.03 ~ )

Conducting Research on LLM Agent Evaluation Benchmark

Riiid / Research Scientist (NLP) (2023.07 ~ 2024.02)

Conducting Research on Large Language Models for Education
• Competed on the Hugging Face Open LLM Leaderboard, achieving 1st place in Oct 2023
• Explored the effects of instruction tuning from data (quantity, quality, diversity) and model (scale, efficiency, objective) perspectives
• Implemented diverse optimization techniques (ZeRO, FSDP, and FlashAttention) for training/inference on a single 8xA100 machine

Automated Essay Scoring
• Achieved state-of-the-art results on public essay scoring benchmarks
• Conducted “Bar exam” scoring that performed better than GPT-4

TUNiB / NLP Engineer (2021.12 ~ 2023.02)

Korean Open‐domain Chatbot Service
• Directed dialogue data collection and quality filtering using advanced LLMs
• Developed an in‐house Korean LM for multi‐persona chatbot with self‐collected datasets
• Operated a Kakaotalk‐based chatbot service

AI Grand Challenge: Policy Support AI
• Awarded Ministry of Science and ICT Minister’s Award
• Orchestrated TableQA data collection with policy domain experts
• Developed a continual learning framework with OCR-based parsing and additional table data
• Developed an integrated QA system for processing texts, tables, and charts

Research Projects

Story Completion Competition (2nd Place) (Oct, 2023)
• Instruction fine-tuned a model with diverse data augmentation, achieving 2nd place on both the public and private leaderboards within 3 days

Achieved SOTA in KLUE Benchmark (Dec, 2021)
• Implemented R-BERT and Retrospective Reader models, and enhanced their structures and learning methods

Open Domain Question Answering Competition (Gold Medal) (Nov, 2021), Blog
• Improved top-1 accuracy of document retrieval from 32% to 78% using hybrid retrieval techniques
• Improved EM score of question answering from 62.7 to 79.9 by effective methods

Relation Extraction Competition (Silver Medal) (Oct, 2021), Blog
• Trained custom embeddings, layers, and a custom loss function on a diversely augmented dataset

Image Classification Competition (Silver Medal) (Sep, 2021)
• Utilized multi-task learning and test time augmentation techniques

Development Projects

Korean College Scholastic Ability Test (Nov, 2023)
• Compared the performance of LLMs in solving the 2024 CSAT

Daily Papers (Nov, 2023)
• Developed a tool that automatically translates Hugging Face’s daily papers into Korean and summarizes them using ChatGPT

PFP Story Generation (Sep, 2022)
• Completed stories for 5,000 PFP characters using GPT-3

“Look, Attend, and Generate Poem” (Dec, 2021)
• Developed a web service for generating poetry from user-uploaded photos

Movie Review Rating Service (Sep, 2020)
• Developed a web service for auto-rating and archiving key movie reviews

Extracurricular Activities

• [Student] Boostcamp AI Tech - NLP Track (2021.07 ~ 2021.12)
• [Student] Innovation Academy - 42 Seoul (2020.01 ~ 2021.03)
• [Researcher] NRF - Language Information Processing 2020 (2019.07 ~ 2019.09)

Awards and Honors

2nd place, CJ Logistics CEO’s Award, The 3rd Future Tech Challenge, CJ Logistics (Sep, 2023)
3rd place, Ministry of Science and ICT Minister’s Award, AI Grand Challenge: Policy Support AI, IITP (Jan, 2023)
1st place, grand prize, The 3rd Annual University Student AI x Bookathon, SKKU, Seoul (Nov, 2021)
3rd place / 19 teams, Open Domain Question Answering Competition, Naver Connect (Nov, 2021)
5th place / 19 teams, Relation Extraction Competition, Naver Connect (Oct, 2021)
13th place / 38 teams, Mask Image Classification Competition, Naver Connect (Sep, 2021)
Scholarship, German Language and Literature Department, Yonsei University (Aug, 2019)

Author

Yohan Lee

Posted on

2021-08-22

Updated on

2024-03-30
