Jongyoon (John) Kim

Ph.D. Student in Artificial Intelligence
specialising in Natural Language Processing

Confident in applying new knowledge to improve systems. Love learning new things and solidifying my knowledge.

Contact

Phone (+82)-10-2589-9365
Facebook DicoTiar
LinkedIn Jongyoon Kim

Location

301 building, 1, Gwanak-ro, Gwanak-gu
Seoul, South Korea 08826 KR

Education

Seoul National University

2023-03-02 — 2029-03-02
Interdisciplinary Program in Artificial Intelligence, Doctor of Philosophy
97 (4.0 / 4.3)
Courses
  • 2023-Spring Probabilistic Graphical Models
  • 2023-Spring Studies in Artificial Intelligence
  • 2023-Spring Social Computing
  • 2023-Spring English Research Paper Writing for Graduate Students: Majors in Natural Sciences and Engineering
  • 2023-Spring Understanding and Practicing University Teaching
  • 2023-Autumn Natural Language Processing
  • 2023-Autumn Recent Topics in Artificial Intelligence
  • 2023-Autumn Machine Listening
  • 2023-Autumn Seminar in Computational Linguistics

University of Bristol

2018-06-01 — 2021-07-10
Electrical and Electronic Engineering, Bachelor
First Class Honours (4.0+/4.5)
Courses
  • EENG30002 Networking Protocol Principles 3
  • EENG30009 Individual Research Project 3
  • EENG30010 Mobile Communication Systems
  • EENG30013 Power Electronics, Machines & Drive Technologies
  • EENG31400 Digital Filters and Spectral Analysis 3
  • EENG34030 Embedded and Real-Time Systems
  • EENG36000 Electronics 3
  • EMAT30670 Optimisation Theory and Applications
  • EENG23000 Control 2
  • EENG21000 Signals and Systems
  • EENG22000 Communications

Advanced Placement (self-taught)

2018-01-01 — 2018-06-30
Physics, University Level Equivalent
average: 4/5
Courses
  • AP Physics 2: 3/5
  • AP Physics C (Mechanics): 5/5

IEN Institute, NCUK Korea

2017-03-01 — 2017-12-31
Chemistry, High School, A-Level Equivalent
A A* A* A*
Courses
  • English for Academic Purposes (EAP): A (IELTS 7.0 Equivalent)
  • Chemistry: A*
  • Further Mathematics: A*
  • Pure Mathematics: A*

Work

NAVER Corp.

Summer Software Engineering Intern

2021-12-01 — 2022-06-30

Data Engineer (Paid Internship). Refactored and improved the PGM click model logic and Query Reformulation candidate extraction on the Hadoop ecosystem.

  • Customised Apache Airflow for the working group.
  • Refactored and improved the PGM click model logic with Scala and Spark on the Hadoop ecosystem. Tuned hyperparameters for all PGM models with grid search. Pipelined and scheduled daily model training with Airflow (training the model on a Spark cluster with data from HDFS).
  • Built a Universal Retrieval Feature Experiment demo tool (for in-house use).
  • Built a CLI plugin to make the in-house retrieval feature evaluation web page (GUI) more convenient to use and to improve it. Presented the plugin at the in-house Engineering Day conference.
  • Extracted Query Reformulation (QR) candidates using cross-attention scores of Sentence-BERT embeddings. Pipelined and scheduled the extraction by calling the PyTorch model deployed on Kubernetes from the Spark cluster and storing the results on HDFS.

University of Bristol

Teaching Assistant

2020-09-01 — 2021-06-30

As an undergraduate Teaching Assistant, I worked on the following units. To support junior students, I answered their queries during drop-in sessions via Teams (video calls). For the SMPS project, I supported about 5 groups with their overall MATLAB (Simulink) simulation design and theoretical understanding.

  • Electronics 2
  • Switching Mode Power Supply (SMPS) Project
  • Fields and Devices

NAVER Corp.

Summer Software Engineering Intern

2020-06-01 — 2020-09-30

Software Engineer Intern. Built an admin tool as a web page for managing search results and engineered features to improve the search engine's rank model.

  • Built the front end (FE) of an internal admin page with Vue.js; the back end (BE) was built by a senior engineer. Cooperating with others on one product (FE-BE) taught me the importance of communicating with colleagues and documenting endpoints clearly. This project sparked my interest in full-stack development.
  • Feature engineering for the video search engine's rank model. To improve the performance of the established rank model, analysed and processed more than 100 previously unused video features. The engineered features improved the rank model's Normalized Discounted Cumulative Gain (NDCG) as well as its precision and recall. Classical machine-learning methods matter as much as deep learning.

Envisible

Summer Internship

2019-06-01 — 2019-09-30

Maintenance of hardware (HW) and software (SW) content for a kids cafe / construction of multiple kiosks for a new branch opening.

  • Supported the maintenance process for old kiosks.
  • Maintained, modified, and fully replaced HW kiosks to accommodate new application development. Gained experience with the following machines for parts production: CNC milling machine, vertical/horizontal milling machine, 3D printer, table saw, and handheld tools. For the CNC milling machine and 3D printer, designed sections of large parts and entire small parts with Autodesk Fusion360.
  • Proposed and applied methods for HW and SW maintenance and improvement.
  • Installed and calibrated sensors and cameras for each kiosk. Most kiosks had fixed-focus cameras, so each camera was mounted in a plastic block to be embedded in the kiosk and then calibrated.

Envisible

Summer Internship

2018-06-01 — 2018-09-30

Maintenance of hardware (HW) and software (SW) content for a kids cafe / construction of multiple kiosks for a new branch opening.

  • Documented basic data from various children's books for designing an education programme.
  • Participated in the kids cafe maintenance project. Built and installed hardware for the kids cafe's new content; removed and disassembled the existing hardware.
  • Measured the hardware specs of kiosk parts and designed them with Autodesk Fusion360.

Publications

Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability

2024-09
Published by CLEF 2024, LongEval

Benchmarking the performance of information retrieval (IR) methods is mostly conducted within a fixed set of documents (static corpora). However, in real-world web search engine environments, the document set is continuously updated and expanded. Addressing these discrepancies and measuring the temporal persistence of IR systems is crucial. By investigating the LongEval benchmark, specifically designed for such dynamic environments, our findings demonstrate the effectiveness of a listwise reranking approach, which proficiently handles inaccuracies induced by temporal distribution shifts. Among listwise rerankers, our findings show that ListT5, which effectively mitigates the positional bias problem by adopting the Fusion-in-Decoder architecture, is especially effective, and more so, as temporal drift increases, on the test-long subset.

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

2024-06
Published by Findings of ACL 2024

Pre-trained language models (PLMs) exhibit promise in retrieval tasks but struggle with out-of-domain data due to distribution shifts. Addressing this, generative domain adaptation (DA), known as GPL, tackles distribution shifts by generating pseudo queries and labels to train models for predicting query-document relationships in new domains. However, it overlooks the domain distribution, causing the model to struggle with aligning the distribution in the target domain. We, therefore, propose Distribution-Aware Domain Adaptation (DADA) to guide the model to consider the domain distribution knowledge at the level of both a single document and the corpus, which is referred to as observation-level feedback and domain-level feedback, respectively. Our method effectively adapts the model to the target domain and expands document representation to unseen gold query terms using domain and observation feedback, as demonstrated by empirical results on the BEIR benchmark.

Single Image Super-Resolution(SISR) for Text-based Image: Benchmark and GUI

2021-04
Published by Jongyoon Kim

Benchmarked multiple SISR methods on text-based images and implemented a GUI so that users can apply the methods more conveniently.

Awards

AI Fellowship

2023~
Seoul National University, Interdisciplinary Program in Artificial Intelligence

Awarded to students studying Artificial Intelligence.

Paul Dirac scholarship

2019-09
University of Bristol - Faculty of Engineering

Awarded to students who show academic excellence in their university application.

NCUK-IEN graduation 2nd place

2018-09
NCUK-IEN Institute

Awarded to students who show academic excellence in the NCUK Foundation Year programme (2nd place).

Volunteer

IEN Institute, NCUK Korea

Student Ambassador, Reporter

2018-01-01 — 2019-05-30

Explained the NCUK Foundation Year programme and illustrated UK university life through blog posts.

  • Wrote short essays and reports with pictures and videos about the University of Bristol for students studying the International Foundation Year in Korea.

Skills

Web-development

MySQL (MariaDB), HTML, CSS, JS (Vue.js), and Python (Django, Flask)

Server

Network protocols, Docker, Docker-Compose, Kubernetes, and nginx

Data Engineering / Machine Learning / Deep Learning

Python (Airflow ETL, crawler development), TensorFlow (Keras), statsmodels, and scikit-learn (for ARIMA and SIER model design)

Electrical & Electronic Engineering

Arduino controller design/coding and IoT devices: Wi-Fi communications, Bluetooth communications

Mechanical Engineering

3D CAD/CAM model design: Fusion360, 3D printer, CNC milling machine, and handheld tools

Languages

Korean

Native speaker

English

Fluent speaker

Interests

Natural Language Processing

NLP, Information Retrieval (IR), Large Language Model, and Reinforcement Learning

Application of Machine Learning

Control systems with machine learning and machine learning for forecasting

Parallel, Distributed computing

Faster computing and large volumes of data

References

Available on Request

...