Florian Metze

Identifiers

  • name variant Florian Metze 0.60 · backfill

Papers (79)

  1. Beyond Words: Towards Effective Modeling of Non-Verbal Vocalizations in ASR eess.AS · 2026 · author #10
  2. Enhancing Conversational TTS with Cascaded Prompting and ICL-Based Online Reinforcement Learning eess.AS · 2026 · author #7
  3. Error-aware Quantization through Noise Tempering cs.LG · 2022 · author #4
  4. Normalized Contrastive Learning for Text-Video Retrieval cs.IR · 2022 · author #5
  5. Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models cs.CL · 2022 · author #4
  6. SQuAT: Sharpness- and Quantization-Aware Training for BERT cs.LG · 2022 · author #4
  7. CTC Alignments Improve Autoregressive Translation cs.CL · 2022 · author #5
  8. ASR2K: Speech Recognition for Around 2000 Languages without Audio cs.CL · 2022 · author #2
  9. Masked Autoencoders that Listen cs.SD · 2022 · author #7
  10. LegoNN: Building Modular Encoder-Decoder Models cs.CL · 2022 · author #6
  11. On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization cs.CL · 2022 · author #4
  12. Robustness of Neural Architectures for Audio Event Detection cs.SD · 2022 · author #4
  13. AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification cs.SD · 2022 · author #4
  14. On Adversarial Robustness of Large-scale Audio Visual Learning cs.SD · 2022 · author #5
  15. Speech Summarization using Restricted Self-Attention cs.CL · 2021 · author #4
  16. VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding cs.CV · 2021 · author #6
  17. Differentiable Allophone Graphs for Language-Universal Speech Recognition cs.CL · 2021 · author #4
  18. Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding cs.CL · 2021 · author #5
  19. Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers cs.CV · 2021 · author #5
  20. VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding cs.CV · 2021 · author #7
  21. Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks cs.CL · 2021 · author #4
  22. Self-supervised object detection from audio-visual correspondence cs.CV · 2021 · author #5
  23. Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning cs.CV · 2021 · author #5
  24. Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models cs.CV · 2021 · author #5
  25. NoiseQA: Challenge Set Evaluation for User-Centric Question Answering cs.CL · 2021 · author #4
  26. Audio-Visual Event Recognition through the lens of Adversary cs.CV · 2020 · author #5
  27. Multimodal Speech Recognition with Unstructured Audio Masking cs.CL · 2020 · author #3
  28. On Long-Tailed Phenomena in Neural Machine Translation cs.CL · 2020 · author #4
  29. Support-set bottlenecks for video-text representation learning cs.CV · 2020 · author #4
  30. Fine-Grained Grounding for Multimodal Speech Recognition cs.CL · 2020 · author #3
  31. Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations stat.ML · 2020 · author #7
  32. How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language cs.CV · 2020 · author #6
  33. Contextual RNN-T For Open Domain ASR eess.AS · 2020 · author #5
  34. AlloVera: A Multilingual Allophone Database cs.CL · 2020 · author #8
  35. ASR Error Correction and Domain Adaptation Using Machine Translation eess.AS · 2020 · author #5
  36. Universal Phone Recognition with a Multilingual Allophone System cs.CL · 2020 · author #11
  37. Towards Zero-shot Learning for Automatic Phonemic Transcription cs.CL · 2020 · author #6
  38. Looking Enhances Listening: Recovering Missing Speech Using Images cs.CL · 2020 · author #3
  39. Gun Source and Muzzle Head Detection cs.CV · 2020 · author #3
  40. Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models cs.CL · 2019 · author #4
  41. On Compositionality in Neural Machine Translation cs.CL · 2019 · author #3
  42. Adversarial Music: Real World Audio Adversary Against Wake-word Detection System cs.CR · 2019 · author #6
  43. Multitask Learning For Different Subword Segmentations In Neural Machine Translation cs.CL · 2019 · author #3
  44. On Leveraging the Visual Modality for Neural Machine Translation cs.CL · 2019 · author #5
  45. On Dimensional Linguistic Properties of the Word Embedding Space cs.CL · 2019 · author #4
  46. SANTLR: Speech Annotation Toolkit for Low Resource Languages cs.CL · 2019 · author #5
  47. Multilingual Speech Recognition with Corpus Relatedness Sampling cs.CL · 2019 · author #4
  48. Cross-Attention End-to-End ASR for Two-Party Conversations eess.AS · 2019 · author #3
  49. Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions cs.CL · 2019 · author #3
  50. Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion cs.CL · 2019 · author #3
  51. Multimodal Abstractive Summarization for How2 Videos cs.CL · 2019 · author #4
  52. Grounding Object Detections With Transcriptions cs.MM · 2019 · author #3
  53. Acoustic-to-Word Models with Conversational Context Information cs.CL · 2019 · author #2
  54. The ARIEL-CMU Systems for LoReHLT18 cs.CL · 2019 · author #18
  55. Phoneme Level Language Models for Sequence Based Low Resource ASR cs.CL · 2019 · author #4
  56. Learned In Speech Recognition: Contextual Acoustic Word Embeddings cs.CL · 2019 · author #3
  57. Learning from Multiview Correlations in Open-Domain Videos cs.LG · 2018 · author #4
  58. Multimodal Grounding for Sequence-to-Sequence Speech Recognition cs.CL · 2018 · author #5
  59. How2: A Large-scale Dataset for Multimodal Language Understanding cs.CL · 2018 · author #7
  60. Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling cs.SD · 2018 · author #2
  61. A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling cs.SD · 2018 · author #3
  62. Activity Recognition on a Large Scale in Short Videos - Moments in Time Dataset cs.CV · 2018 · author #6
  63. Dialog-context aware end-to-end speech recognition cs.CL · 2018 · author #2
  64. Domain Robust Feature Extraction for Rapid Low Resource ASR Development cs.CL · 2018 · author #3
  65. Acoustic-to-Word Recognition with Sequence-to-Sequence Models eess.AS · 2018 · author #2
  66. Hierarchical Multi Task Learning With CTC cs.CL · 2018 · author #2
  67. End-to-End Multimodal Speech Recognition eess.AS · 2018 · author #3
  68. Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks cs.SD · 2018 · author #3
  69. Sequence-based Multi-lingual Low Resource Speech Recognition cs.CL · 2018 · author #3
  70. Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop cs.CL · 2018 · author #5
  71. A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging cs.SD · 2017 · author #4
  72. Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection cs.SD · 2017 · author #5
  73. Subword and Crossword Units for CTC Acoustic Models cs.CL · 2017 · author #3
  74. Visual Features for Context-Aware Speech Recognition cs.CL · 2017 · author #4
  75. Annotating High-Level Structures of Short Stories and Personal Anecdotes cs.CL · 2017 · author #4
  76. Comparison of Decoding Strategies for CTC Acoustic Models cs.CL · 2017 · author #3
  77. A Comparison of deep learning methods for environmental sound cs.SD · 2017 · author #3
  78. Robust end-to-end deep audiovisual speech recognition cs.CL · 2016 · author #2
  79. EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding cs.CL · 2015 · author #3

Mentions

  • 2206.03318 #6 · arxiv_oai · confidence 0.70 Florian Metze
  • 2207.06405 #7 · arxiv_oai · confidence 0.70 Florian Metze
  • 2212.11790 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2212.05603 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2210.15734 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2205.11686 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2210.07171 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2210.05200 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2209.02842 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 2203.13448 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2205.03268 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2104.06401 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2203.12122 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2110.06263 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2103.10211 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2106.05392 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2109.14084 #6 · arxiv_oai · confidence 0.70 Florian Metze
  • 2105.09996 #7 · arxiv_oai · confidence 0.70 Florian Metze
  • 2107.11628 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2106.15065 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2001.11120 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 2009.05739 #7 · arxiv_oai · confidence 0.70 Florian Metze
  • 2105.00573 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2103.08849 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2008.08143 #6 · arxiv_oai · confidence 0.70 Florian Metze
  • 2102.08345 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2010.02824 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2011.07430 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2010.08642 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 2010.04924 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2010.02384 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 2006.03411 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 1910.02211 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 2004.08031 #8 · arxiv_oai · confidence 0.70 Florian Metze
  • 2003.07692 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 2002.11800 #11 · arxiv_oai · confidence 0.70 Florian Metze
  • 2002.11781 #6 · arxiv_oai · confidence 0.70 Florian Metze
  • 2002.05639 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1907.00477 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1911.01497 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1911.00126 #6 · arxiv_oai · confidence 0.70 Florian Metze
  • 1911.03782 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1910.12368 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1910.02754 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 1908.01067 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 1908.01060 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1906.06147 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1907.10726 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1906.11604 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1906.07901 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1905.08796 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 1811.08890 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1902.08899 #18 · arxiv_oai · confidence 0.70 Florian Metze
  • 1902.07613 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1811.03865 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 1902.06833 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1810.09052 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 1810.09050 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1807.07104 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 1811.00347 #7 · arxiv_oai · confidence 0.70 Florian Metze
  • 1807.10984 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1809.00241 #6 · arxiv_oai · confidence 0.70 Florian Metze
  • 1807.09597 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 1808.02171 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 1712.06855 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1804.09713 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1804.01146 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1712.09673 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 1802.07420 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1712.09680 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1710.06917 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1802.05092 #5 · arxiv_oai · confidence 0.70 Florian Metze
  • 1712.00489 #4 · arxiv_oai · confidence 0.70 Florian Metze
  • 1708.04469 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1703.06902 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 1611.06986 #2 · arxiv_oai · confidence 0.70 Florian Metze
  • 1507.08240 #3 · arxiv_oai · confidence 0.70 Florian Metze
  • 2607.01563 #10 · arxiv_oai · confidence 0.70 Florian Metze
  • 1507.08240 #3 · backfill · confidence 0.70 Florian Metze

Frequent Coauthors