[recruit] SKT AI Fellowship, 3rd Cohort: Call for Applications

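  • Details: https://www.sktaifellowship.com/
  • Eligibility
    • Any undergraduate or graduate student with an interest in, or experience with, AI technology development
    • Apply as a team of two or three
  • Benefits
    • Preferential treatment at the document-screening stage when applying for regular employment
    • Mentoring from working SKT developers
    • Research funding of KRW 6 million for carrying out the project
    • Prize for outstanding final projects: KRW 4 million
      (best team: KRW 4 million; excellent team: KRW 3 million)
    • Invitations to major SK Group technology events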
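  • Key dates
    • Application period: 04/15 (Thu) – 05/16 (Sun), until midnight
    • Document screening results announced: 05/25 (Tue)
    • Online interview/presentation review: 05/31 (Mon)
    • 3rd cohort orientation: 06/04 (Fri)
    • Midterm review: August
    • Final project presentations: early November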
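  • Research topics
    • Development of new Vision AI application technology for service robots
    • Development of an intelligent data search engine
    • Development of technology for detecting fake videos generated by GANs
    • Development of anomaly detection for vibration/pressure/temperature sensors for Smart Factory services
    • Development of Vision AI application models based on 5GX MEC
    • Self-supervised Learning on Billion Unlabeled Image Data and Full Stack Product
    • Research on AI-based camera location estimation and billboard/signboard detection
    • Development of language processing applications based on KoBERT/KoGPT/KoBART
    • Development of automatic recognition technology for smart logistics based on Kinect data
    • Development of a multi-modal emotion recognition AI model
    • Development of AI-based restoration technology for old digital media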

[seminar] A review of on-device fully neural end-to-end speech recognition and synthesis algorithms

  • Speaker: Dr. Chanwoo Kim (Vice President, Samsung)
    http://www.cs.cmu.edu/~chanwook/
  • Format: online (Webex)
  • Link: https://dongguk.webex.com/dongguk/j.php?MTID=m76990a14d544ddb14ad59f8ed6638d5d
  • Password: aixx
  • Abstract: In this talk, we review various end-to-end automatic speech recognition and speech synthesis algorithms and their optimization techniques for on-device applications. Conventional speech recognition systems comprise a large number of discrete components such as an acoustic model, a language model, a pronunciation model, a text normalizer, an inverse text normalizer, a decoder based on a Weighted Finite-State Transducer (WFST), and so on. To obtain sufficiently high speech recognition accuracy with such conventional systems, a very large language model (up to 100 GB) is usually needed. Hence, the corresponding WFST becomes enormous, which prohibits on-device implementation. Recently, fully neural end-to-end speech recognition algorithms have been proposed. Examples include speech recognition systems based on Connectionist Temporal Classification (CTC), the Recurrent Neural Network Transducer (RNN-T), Attention-based Encoder-Decoder models (AED), Monotonic Chunk-wise Attention (MoChA), transformer-based speech recognition systems, and so on. The inverse process of speech recognition is speech synthesis, where a text sequence is converted into a waveform. Conventional speech synthesizers are usually based on parametric or concatenative approaches. Even though Text-to-Speech (TTS) systems based on concatenative approaches have shown relatively good sound quality, they cannot be easily employed for on-device applications because of their immense size. Recently, neural speech synthesis approaches based on Tacotron and WaveNet started a new era of TTS with significantly better speech quality. More recently, vocoders based on LPCNet require significantly less computation than WaveNet, which makes it feasible to run these algorithms on on-device platforms. These fully neural network-based systems require much smaller memory footprints compared to conventional algorithms.

[seminar] Toward Automatic Math Word Problem Solving

  • Speaker: Dr. Chin-Yew Lin (Microsoft)
    https://www.microsoft.com/en-us/research/people/cyl/
  • Format: online (Webex)
  • Link: https://dongguk.webex.com/dongguk/j.php?MTID=mf0fe51d4943a75e0335e8b64b1ec3b37
  • Password: aixx
  • Abstract: Computer programs can complete many tasks much more effectively and efficiently than human beings, such as calculating the product of two large numbers, or finding all occurrences of a string in a long text. However, the performance of computers on many intelligent tasks is still low. For example, in a chatting scenario, computers often generate irrelevant or incorrect responses; we can easily find amusing results in automatic machine translations; and it is still a very challenging task for state-of-the-art computer programs to solve even primary-school-level math word problems. As an exploration project in grounded and executable semantic parsing and an effort to push toward real-world knowledge computing, the SigmaDolphin project at MSRA aims to build an intelligent computer system that can automatically solve math word problems. In this talk, I will summarize our findings in addressing the three major challenges of math word problem solving: dataset creation, math word problem understanding, and math equation generation.

[seminar] Responsible Data Use in the context of Explainable AI

  • Abstract: Responsible Data Use is becoming increasingly important for all businesses that manage data. At LinkedIn, where trust is key, we take Responsible Data Use very seriously, from protecting members, to privacy, to data governance, and explainable AI. In this talk, I will first provide an overview of LinkedIn and Data Science at LinkedIn. I will then briefly cover what responsible data use is, focus on the area of explainable AI, and describe our end-to-end solution that is now being used in production for some of our internal products. The emphasis of this deep dive is to communicate what it takes to get such work into production beyond the main algorithms and theory.

[seminar] Commonsense Knowledge in AI

  • Speaker: Prof. Henry Lieberman
  • Affiliation: MIT
  • Abstract: Despite all the recent successes of AI, computers still struggle to capture simple knowledge about people and everyday life, which we call “commonsense” knowledge. Commonsense knowledge underlies our ability to understand language and perform problem solving. Commonsense knowledge is different from “factual” knowledge, as you might find in Wikipedia or encyclopedias. Commonsense reasoning is also different from probabilistic reasoning, as humans (as far as we know) perform commonsense reasoning without the counting operations inherent in probability. Commonsense reasoning is about plausibility rather than truth per se, and is best performed by analogical reasoning. I will describe efforts to collect commonsense knowledge, to reason with it, and to synthesize both commonsense and probabilistic approaches. Commonsense knowledge is important in user interfaces for intelligent agents, for sensible default behavior for interfaces, and for explanation and debugging.

[seminar] Expression recognition ≠ emotion understanding: Challenges confronting the field of affective computing

  • Speaker: Prof. Jon Gratch
  • Affiliation: University of Southern California
  • Topic: Expression recognition ≠ emotion understanding: Challenges confronting the field of affective computing
  • Abstract: Many assume that a person’s emotional state can be accurately inferred from surface cues such as facial expressions and voice quality, or through physiological signals such as skin conductance or heart rate variability. Indeed, this assumption is reflected in many commercial “affect recognition” tools. For example, companies provide software that promises to “understand how your customers and viewers feel when they can’t or won’t say so themselves.” However, research in affective science highlights that the connection between surface cues, like facial expressions, and feelings of emotion is quite weak and highly context-specific. Even worse, these methods often fail to correctly classify these surface cues outside pristine laboratory conditions. In this talk, I will review some of the biases and potential solutions for expression recognition. I will then discuss how to move from expressions to understanding. Along the way, I will emphasize the problematic nature of the term “emotion recognition”: it leads users to overgeneralize the capabilities of the technology (in that expressions don’t necessarily indicate emotion) but also to undersell its power (in that expressions can indicate important information about many things besides emotion).