Academic works
I am Wan Ziyu, currently a doctoral student at the [https://hci-lab.jp/ Human-Computer Interaction Laboratory], Hokkaido University. I have a relatively formal education in opto-electrical engineering and some computer science, and an informal self-education in linguistics and phonetics.
I '''am not very proud to say''' that I have no publications as of now (whenever you are looking at this page, that is).
However, I feel that it is equally important to introduce what I am currently doing and what I have done, on the slight chance that you are interested.
== Research strengths and, well, abilities ==
I don't have any particularly notable certificates or qualifications, so take everything I list here with a pinch of salt.
* Digital signal processing with Python (librosa / numpy / scipy) and some C++ (iPlug2, for building VST plug-ins); a rough sketch follows after this list.
* Neural networks (rather basic) with TensorFlow / Keras. Also a tad of PyTorch, but I hate migrating between toolkits.
* Some HTML / JavaScript for unpretty utility webpages.
* Some bash / Python for task automation.
* LLM prompt composition and generic LLM utilisation.
* Basic welding and electrician skills.
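To give a flavour of the DSP side, here is a minimal, hypothetical sketch (the file name and the filter band are made up, not taken from any real project of mine): band-pass a recording with scipy and turn it into a log-magnitude spectrogram with librosa.
<syntaxhighlight lang="python">
import numpy as np
import librosa
from scipy.signal import butter, filtfilt

# Hypothetical input file; any mono WAV will do.
y, sr = librosa.load("speech_sample.wav", sr=16000)

# Band-pass roughly the telephone speech band (300 Hz - 3.4 kHz), 4th-order Butterworth.
b, a = butter(4, [300, 3400], btype="bandpass", fs=sr)
y_filtered = filtfilt(b, a, y)

# Short-time Fourier transform, magnitude converted to decibels for inspection.
spectrogram_db = librosa.amplitude_to_db(np.abs(librosa.stft(y_filtered)), ref=np.max)
print(spectrogram_db.shape)  # (frequency bins, time frames)
</syntaxhighlight>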
== Research topics / interests ==
=== Language-independent speech recognition ===
To capture the phonemes (sounds) of speech rather than the text of a particular language, I propose a language-independent speech recognition system.
''See: [[Language-independent speech recognition]]''
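The actual system is described on the linked page; below is only a minimal sketch of the underlying idea, with a made-up phone inventory and plain MFCC frame features: classify each acoustic frame into a language-independent phone class rather than decoding words of any particular language.
<syntaxhighlight lang="python">
import librosa
import numpy as np
import tensorflow as tf

# Hypothetical, tiny phone inventory; a real language-independent one would cover the IPA.
PHONES = ["a", "i", "u", "e", "o", "p", "t", "k", "s", "n", "m", "sil"]

def mfcc_frames(wav_path, sr=16000, n_mfcc=13):
    """Per-frame MFCC features, shape (num_frames, n_mfcc)."""
    y, _ = librosa.load(wav_path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def build_frame_phone_classifier(n_mfcc=13, n_phones=len(PHONES)):
    """Toy per-frame classifier; a real model would add temporal context (RNN / Transformer / CTC)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_mfcc,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_phones, activation="softmax"),  # posterior over phones, not words
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model
</syntaxhighlight>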
=== LLM-powered conversational (car) navigation agent ===
Based on the half-faith, half-fact (proportions may vary) that vehicle navigation is a social, conversational task, I think it might be good to enable navigation software to hold a conversation with the driver as well.
''See: [[LLM navigation agent]]''
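For a flavour of the prompt-composition side (the agent itself is on the linked page), here is a hypothetical sketch; <code>call_llm</code> stands in for whatever chat-completion client would actually be used and is not a real API.
<syntaxhighlight lang="python">
def build_navigation_prompt(route_summary, driver_utterance):
    """Compose a chat-style prompt for one conversational navigation turn.

    Both the system wording and the message format are illustrative only.
    """
    system = (
        "You are an in-car navigation assistant. "
        "Keep answers short and speakable, refer to the current route, "
        "and ask a brief follow-up question when the driver's request is ambiguous."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Current route: {route_summary}\nDriver says: {driver_utterance}"},
    ]

# Hypothetical usage; call_llm is a placeholder, not a real client.
# reply = call_llm(build_navigation_prompt("Route 5 towards Sapporo, 25 min remaining",
#                                          "I'm getting hungry, anything on the way?"))
</syntaxhighlight>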
=== (Pro?)active agent guidance ===
A spin-off from the navigator idea. Making the agent capable of actively asking for information might help the human user provide more structured, concise and solid input.
''See: [[Active agent guidance]]''
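Again only a rough sketch of the "ask before acting" loop, assuming the agent tracks a small set of slots it needs before it can plan; the slot names and questions are made up for illustration.
<syntaxhighlight lang="python">
# Information the agent needs before it can plan a route, with the question it would ask.
REQUIRED_SLOTS = {
    "destination": "Where would you like to go?",
    "departure_time": "When do you want to leave?",
    "route_preference": "Do you prefer the fastest route, or one avoiding tolls?",
}

def next_question(filled_slots):
    """Return the next clarifying question the agent should proactively ask, or None if it can act."""
    for slot, question in REQUIRED_SLOTS.items():
        if not filled_slots.get(slot):
            return question
    return None  # everything needed is known; plan instead of asking

# Example: with only a destination given, the agent asks about departure time next.
print(next_question({"destination": "Otaru"}))
</syntaxhighlight>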