About Me
Research interests
• Reliability of Large Language Models
• Data-centric AI
Education
• Bachelor’s Degree in German Language & Literature, Cognitive Science
Yonsei University (2015 ~ 2022)
Work Experience
WRTN Technologies / AI Research Engineer (NLP Specialist) (2024.03 ~ )
Conducting Research on LLM Agent Evaluation Benchmark
Riiid / Research Scientist (NLP) (2023.07 ~ 2024.02)
Conducting Research on Large Language Models for Education
• Competed on the Hugging Face Open LLM Leaderboard, achieving 1st place in Oct, 2023
• Explored the effects of instruction tuning from data (quantity, quality, diversity) and model (scale, efficiency, objective) perspectives
• Implemented diverse optimization techniques (ZeRO, FSDP, and FlashAttention) for training/inference on a single 8xA100 machine
Automated Essay Scoring
• Achieved state‐of‐the‐art results on public essay scoring benchmarks
• Conducted “Bar exam” scoring that outperformed GPT‐4
TUNiB / NLP Engineer (2021.12 ~ 2023.02)
Korean Open‐domain Chatbot Service
• Directed dialogue data collection and quality filtering using advanced LLMs
• Developed an in‐house Korean LM for multi‐persona chatbot with self‐collected datasets
• Operated a Kakaotalk‐based chatbot service
AI Grand Challenge: Policy Support AI
• Awarded the Ministry of Science and ICT Minister’s Award
• Orchestrated TableQA data collection with policy domain experts
• Developed a continual learning framework with OCR‐based parsing and additional table data
• Developed an integrated QA system for processing texts, tables, and charts
Research Projects
Story Completion Competition (2nd Place) (Oct, 2023)
• Instruction fine-tuned a model with diverse data augmentation, achieving 2nd place on both the public and private leaderboards within 3 days
Achieved SOTA on the KLUE Benchmark (Dec, 2021)
• Implemented R-BERT and Retrospective Reader models, and enhanced their structures and learning methods
Open Domain Question Answering Competition (Gold Medal) (Nov, 2021), Blog
• Improved top-1 accuracy of document retrieval from 32% to 78% using hybrid retrieval techniques
• Improved the EM score of question answering from 62.7 to 79.9
Relation Extraction Competition (Silver Medal) (Oct, 2021), Blog
• Trained custom embeddings, layers, and a loss function on a diverse augmented dataset
Image Classification Competition (Silver Medal) (Sep, 2021)
• Used multi-task learning and test-time augmentation techniques
Development Projects
Korean College Scholastic Ability Test (Nov, 2023)
• Compared the performance of LLMs in solving the 2024 CSAT
Daily Papers (Nov, 2023)
• Developed a tool that automatically translates Hugging Face’s Daily Papers into Korean and summarizes them using ChatGPT
PFP Story Generation (Sep, 2022)
• Completed stories for 5,000 PFP characters using GPT-3
“Look, Attend, and Generate Poem” (Dec, 2021)
• Developed a web service for generating poetry from user-uploaded photos
Movie Review Rating Service (Sep, 2020)
• Developed a web service for auto-rating and archiving key movie reviews
Extracurricular Activities
• [Student] Boostcamp AI Tech - NLP Track (2021.07 ~ 2021.12)
• [Student] Innovation Academy - 42 Seoul (2020.01 ~ 2021.03)
• [Researcher] NRF - Language Information Processing 2020 (2019.07 ~ 2019.09)
Awards and Honors
• 2nd place, CJ Logistics CEO’s Award, The 3rd Future Tech Challenge, CJ Logistics (Sep, 2023)
• 3rd place, Ministry of Science and ICT Minister’s Award, AI Grand Challenge: Policy Support AI, IITP (Jan, 2023)
• 1st place, grand prize, The 3rd Annual University Student AI x Bookathon, SKKU, Seoul (Nov, 2021)
• 3rd place / 19 teams, Open Domain Question Answering Competition, Naver Connect (Nov, 2021)
• 5th place / 19 teams, Relation Extraction Competition, Naver Connect (Oct, 2021)
• 13th place / 38 teams, Mask Image Classification Competition, Naver Connect (Sep, 2021)
• Scholarship, German Language and Literature Department, Yonsei University (Aug, 2019)