Results - Document Visual Question Answering

    method: Human Performance (2020-06-13)

    Authors: DocVQA Organizers

    Affiliation: CVIT, IIIT Hyderabad, CVC-UAB, Amazon

    Description: Human performance on the test set. A small group of volunteers was asked to enter an answer for each question and its accompanying image.

    @InProceedings{docvqa_wacv, author = {Mathew, Minesh and Karatzas, Dimosthenis and Jawahar, C.V.}, title = {DocVQA: A Dataset for VQA on Document Images}, booktitle = {WACV}, year = {2021}, pages = {2200-2209} }

    method: qwen3vl (2025-09-23)

    Authors: Qwen Team

    Description: The goal of Qwen3-VL is to enable the model not only to "see" images or videos but also to truly understand the world, comprehend events, and take actions. To achieve this, we have made systematic upgrades in multiple key capability dimensions, striving to transform visual large models from "perception" to "cognition", and from "recognition" to "reasoning and execution".

    method: Seed-VL-1.5 (2025-05-13)

    Authors: Seed-VL

    Affiliation: ByteDance

    Description: Seed-VL-1.5

    @misc{guo2025seed15vltechnicalreport, title={Seed1.5-VL Technical Report}, author={Dong Guo and Faming Wu and Feida Zhu and Fuxing Leng and Guang Shi and Haobin Chen and Haoqi Fan and Jian Wang and Jianyu Jiang and Jiawei Wang and Jingji Chen and Jingjia Huang and Kang Lei and Liping Yuan and Lishu Luo and Pengfei Liu and Qinghao Ye and Rui Qian and Shen Yan and Shixiong Zhao and Shuai Peng and Shuangye Li and Sihang Yuan and Sijin Wu and Tianheng Cheng and Weiwei Liu and Wenqian Wang and Xianhan Zeng and Xiao Liu and Xiaobo Qin and Xiaohan Ding and Xiaojun Xiao and Xiaoying Zhang and Xuanwei Zhang and Xuehan Xiong and Yanghua Peng and Yangrui Chen and Yanwei Li and Yanxu Hu and Yi Lin and Yiyuan Hu and Yiyuan Zhang and Youbin Wu and Yu Li and Yudong Liu and Yue Ling and Yujia Qin and Zanbo Wang and Zhiwu He and Aoxue Zhang and Bairen Yi and Bencheng Liao and Can Huang and Can Zhang and Chaorui Deng and Chaoyi Deng and Cheng Lin and Cheng Yuan and Chenggang Li and Chenhui Gou and Chenwei Lou and Chengzhi Wei and Chundian Liu and Chunyuan Li and Deyao Zhu and Donghong Zhong and Feng Li and Feng Zhang and Gang Wu and Guodong Li and Guohong Xiao and Haibin Lin and Haihua Yang and Haoming Wang and Heng Ji and Hongxiang Hao and Hui Shen and Huixia Li and Jiahao Li and Jialong Wu and Jianhua Zhu and Jianpeng Jiao and Jiashi Feng and Jiaze Chen and Jianhui Duan and Jihao Liu and Jin Zeng and Jingqun Tang and Jingyu Sun and Joya Chen and Jun Long and Junda Feng and Junfeng Zhan and Junjie Fang and Junting Lu and Kai Hua and Kai Liu and Kai Shen and Kaiyuan Zhang and Ke Shen and Ke Wang and Keyu Pan and Kun Zhang and Kunchang Li and Lanxin Li and Lei Li and Lei Shi and Li Han and Liang Xiang and Liangqiang Chen and Lin Chen and Lin Li and Lin Yan and Liying Chi and Longxiang Liu and Mengfei Du and Mingxuan Wang and Ningxin Pan and Peibin Chen and Pengfei Chen and Pengfei Wu and Qingqing Yuan and Qingyao Shuai and Qiuyan Tao and Renjie Zheng and Renrui Zhang and Ru Zhang 
and Rui Wang and Rui Yang and Rui Zhao and Shaoqiang Xu and Shihao Liang and Shipeng Yan and Shu Zhong and Shuaishuai Cao and Shuangzhi Wu and Shufan Liu and Shuhan Chang and Songhua Cai and Tenglong Ao and Tianhao Yang and Tingting Zhang and Wanjun Zhong and Wei Jia and Wei Weng and Weihao Yu and Wenhao Huang and Wenjia Zhu and Wenli Yang and Wenzhi Wang and Xiang Long and XiangRui Yin and Xiao Li and Xiaolei Zhu and Xiaoying Jia and Xijin Zhang and Xin Liu and Xinchen Zhang and Xinyu Yang and Xiongcai Luo and Xiuli Chen and Xuantong Zhong and Xuefeng Xiao and Xujing Li and Yan Wu and Yawei Wen and Yifan Du and Yihao Zhang and Yining Ye and Yonghui Wu and Yu Liu and Yu Yue and Yufeng Zhou and Yufeng Yuan and Yuhang Xu and Yuhong Yang and Yun Zhang and Yunhao Fang and Yuntao Li and Yurui Ren and Yuwen Xiong and Zehua Hong and Zehua Wang and Zewei Sun and Zeyu Wang and Zhao Cai and Zhaoyue Zha and Zhecheng An and Zhehui Zhao and Zhengzhuo Xu and Zhipeng Chen and Zhiyong Wu and Zhuofan Zheng and Zihao Wang and Zilong Huang and Ziyu Zhu and Zuquan Song}, year={2025}, eprint={2505.07062}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2505.07062}, }

    Ranking Table

    Date | Method | Score | Figure/Diagram | Form | Table/List | Layout | Free_text | Image/Photo | Handwritten | Yes/No | Others
    2020-06-13 | Human Performance | 0.9811 | 0.9756 | 0.9825 | 0.9780 | 0.9845 | 0.9839 | 0.9740 | 0.9717 | 0.9974 | 0.9828
    2025-09-23 | qwen3vl | 0.9725 | 0.9337 | 0.9837 | 0.9785 | 0.9755 | 0.9550 | 0.9258 | 0.9658 | 1.0000 | 0.9640
    2025-05-13 | Seed-VL-1.5 | 0.9691 | 0.9447 | 0.9815 | 0.9764 | 0.9674 | 0.9582 | 0.9162 | 0.9522 | 1.0000 | 0.9464
    2025-11-09 | Star_LLM | 0.9671 | 0.9620 | 0.9772 | 0.9635 | 0.9724 | 0.9526 | 0.9533 | 0.9661 | 0.9655 | 0.9198
    2024-07-11 | qwen2-vl | 0.9670 | 0.9206 | 0.9816 | 0.9703 | 0.9678 | 0.9619 | 0.9135 | 0.9436 | 0.9655 | 0.9540
    2025-11-26 | 1 | 0.9607 | 0.9413 | 0.9765 | 0.9622 | 0.9608 | 0.9446 | 0.9232 | 0.9483 | 0.9655 | 0.9473
    2025-12-16 | 3 | 0.9590 | 0.9291 | 0.9695 | 0.9643 | 0.9592 | 0.9515 | 0.9122 | 0.9406 | 0.9655 | 0.9417
    2024-06-30 | InternVL2-Pro (generalist) | 0.9506 | 0.8888 | 0.9714 | 0.9486 | 0.9582 | 0.9446 | 0.8909 | 0.9278 | 0.9655 | 0.9410
    2025-06-04 | MiMo-VL-7B-RL | 0.9501 | 0.9159 | 0.9712 | 0.9658 | 0.9389 | 0.9344 | 0.8601 | 0.9458 | 0.9540 | 0.9292
    2025-01-16 | VideoLLaMA3-7B | 0.9494 | 0.8843 | 0.9692 | 0.9498 | 0.9532 | 0.9427 | 0.8838 | 0.9293 | 0.9310 | 0.9313
    2025-09-16 | LLaVA-One-Vision-1.5-8B-Instruct | 0.9484 | 0.8998 | 0.9669 | 0.9530 | 0.9525 | 0.9269 | 0.8513 | 0.9212 | 0.9310 | 0.9438
    2025-09-30 | Snowflake Arctic-Extract 7B | 0.9470 | 0.9126 | 0.9671 | 0.9503 | 0.9506 | 0.9321 | 0.8245 | 0.9259 | 0.9655 | 0.9175
    2025-07-16 | CATI-VLM-IoT | 0.9448 | 0.9149 | 0.9701 | 0.9508 | 0.9490 | 0.9145 | 0.9092 | 0.9344 | 0.7586 | 0.9055
    2025-11-20 | 0 | 0.9435 | 0.9058 | 0.9628 | 0.9540 | 0.9464 | 0.9100 | 0.8888 | 0.9101 | 0.9655 | 0.9154
    2025-04-03 | test | 0.9406 | 0.8864 | 0.9603 | 0.9432 | 0.9417 | 0.9189 | 0.8639 | 0.9153 | 0.8966 | 0.9231
    2024-09-25 | Molmo-72B | 0.9351 | 0.8822 | 0.9548 | 0.9387 | 0.9413 | 0.9100 | 0.8688 | 0.9196 | 0.9195 | 0.9229
    2025-08-11 | CCK-KVQwen | 0.9348 | 0.8570 | 0.9566 | 0.9283 | 0.9473 | 0.9285 | 0.8774 | 0.9122 | 0.8506 | 0.9315
    2025-02-26 | Qwen2.5-3B-lite | 0.9342 | 0.8807 | 0.9622 | 0.9316 | 0.9397 | 0.9128 | 0.9022 | 0.9234 | 0.7931 | 0.8887
    2024-12-13 | DeepSeek-VL2 | 0.9330 | 0.8853 | 0.9575 | 0.9364 | 0.9309 | 0.9214 | 0.8685 | 0.8988 | 0.8966 | 0.9008
    2024-01-24 | qwenvl-max (single generalist model) | 0.9307 | 0.8491 | 0.9474 | 0.9195 | 0.9403 | 0.9380 | 0.8652 | 0.8922 | 0.8621 | 0.9341
    2025-10-22 | CATI-VLM | 0.9242 | 0.8617 | 0.9511 | 0.9253 | 0.9364 | 0.8925 | 0.8534 | 0.8693 | 0.8276 | 0.8879
    2024-05-10 | Vary (using multi crop) | 0.9241 | 0.8926 | 0.9372 | 0.8953 | 0.9405 | 0.9447 | 0.9035 | 0.9335 | 0.8739 | 0.9478
    2024-04-27 | InternVL-1.5-Plus (generalist) | 0.9234 | 0.8354 | 0.9556 | 0.9123 | 0.9397 | 0.9032 | 0.8313 | 0.9064 | 0.9655 | 0.9098
    2024-11-01 | MLCD-Embodied-7B: Multi-label Cluster Discrimination for Visual Representation Learning | 0.9158 | 0.8286 | 0.9315 | 0.9131 | 0.9289 | 0.9088 | 0.7804 | 0.8300 | 0.8897 | 0.8796
    2023-12-07 | qwenvl-plus (single generalist model) | 0.9141 | 0.8146 | 0.9464 | 0.8999 | 0.9277 | 0.9265 | 0.8419 | 0.8776 | 0.9310 | 0.8667
    2025-06-15 | granite-vision-3.3-2b | 0.9087 | 0.8092 | 0.9382 | 0.9118 | 0.9141 | 0.8913 | 0.8104 | 0.8352 | 0.8966 | 0.8325
    2023-11-15 | SMoLA-PaLI-X Specialist Model | 0.9084 | 0.7790 | 0.9416 | 0.8934 | 0.9262 | 0.9188 | 0.7911 | 0.8508 | 0.8966 | 0.8456
    2025-01-08 | PP-DocBee-2B | 0.9056 | 0.7998 | 0.9541 | 0.8910 | 0.9211 | 0.8800 | 0.8901 | 0.8911 | 0.7561 | 0.8893
    2023-12-07 | SMoLA-PaLI-X Generalist Model | 0.9055 | 0.7757 | 0.9381 | 0.8924 | 0.9187 | 0.9179 | 0.8364 | 0.8483 | 0.7446 | 0.8609
    2024-05-01 | Snowflake Arctic-TILT 0.8B (fine-tuned) | 0.9020 | 0.7198 | 0.9398 | 0.9152 | 0.9015 | 0.9042 | 0.6860 | 0.8415 | 0.6897 | 0.8604
    2022-10-08 | BAIDU-DI | 0.9016 | 0.6823 | 0.9186 | 0.9139 | 0.9138 | 0.9234 | 0.6841 | 0.7949 | 0.6181 | 0.8344
    2024-04-02 | InternLM-XComposer2-4KHD-7B | 0.9002 | 0.8041 | 0.9400 | 0.8965 | 0.9143 | 0.8618 | 0.7845 | 0.8264 | 0.8621 | 0.8298
    2024-02-10 | ScreenAI 5B | 0.8988 | 0.7297 | 0.9419 | 0.8928 | 0.9158 | 0.8873 | 0.7722 | 0.8160 | 0.8966 | 0.8551
    2024-05-01 | Snowflake Arctic-TILT 0.8B (zero-shot) | 0.8881 | 0.6826 | 0.9311 | 0.9011 | 0.8867 | 0.8917 | 0.6534 | 0.8219 | 0.6897 | 0.8515
    2022-03-31 | Tencent Youtu | 0.8866 | 0.7576 | 0.9470 | 0.8932 | 0.8821 | 0.8654 | 0.6680 | 0.8877 | 0.4828 | 0.8413
    2022-01-13 | ERNIE-Layout 2.0 | 0.8841 | 0.6434 | 0.9177 | 0.8996 | 0.8899 | 0.9010 | 0.6223 | 0.7836 | 0.6124 | 0.8118
    2023-12-10 | DocFormerv2 (Single Model with 750M Parameters) | 0.8784 | 0.6680 | 0.9382 | 0.9076 | 0.8676 | 0.8555 | 0.5840 | 0.8123 | 0.8276 | 0.8070
    2024-10-30 | BlueLM-V-3B | 0.8775 | 0.7652 | 0.9245 | 0.8659 | 0.9005 | 0.8372 | 0.8079 | 0.8276 | 0.7931 | 0.7734
    2024-09-08 | neetolab-sota-v1 | 0.8759 | 0.7938 | 0.9209 | 0.8577 | 0.8946 | 0.8558 | 0.8011 | 0.8664 | 0.6207 | 0.8261
    2021-11-26 | Mybank-DocReader | 0.8755 | 0.6682 | 0.9233 | 0.8763 | 0.8896 | 0.8713 | 0.6290 | 0.8047 | 0.5805 | 0.7804
    2021-09-06 | ERNIE-Layout 1.0 | 0.8753 | 0.6586 | 0.8972 | 0.8864 | 0.8902 | 0.8943 | 0.6392 | 0.7331 | 0.5434 | 0.8115
    2024-08-22 | Mini-Monkey | 0.8738 | 0.7334 | 0.9350 | 0.8493 | 0.9046 | 0.8383 | 0.7931 | 0.8262 | 0.6782 | 0.7628
    2024-05-31 | GPT-4 Vision Turbo + Amazon Textract OCR | 0.8736 | 0.7346 | 0.9196 | 0.8756 | 0.8678 | 0.8709 | 0.8137 | 0.8681 | 0.8966 | 0.8464
    2021-02-12 | Applica.ai TILT | 0.8705 | 0.6082 | 0.9459 | 0.8980 | 0.8592 | 0.8581 | 0.5508 | 0.8139 | 0.6897 | 0.7788
    2023-05-31 | PaLI-X (Google Research; Single Generative Model) | 0.8679 | 0.6971 | 0.8992 | 0.8400 | 0.8955 | 0.8925 | 0.7589 | 0.7209 | 0.8966 | 0.8468
    2025-06-25 | table-r1_qx | 0.8672 | 0.7956 | 0.8660 | 0.8757 | 0.8891 | 0.8449 | 0.7624 | 0.8039 | 0.8621 | 0.8332
    2020-12-22 | LayoutLM 2.0 (single model) | 0.8672 | 0.6574 | 0.8953 | 0.8769 | 0.8791 | 0.8707 | 0.7287 | 0.6729 | 0.5517 | 0.8103
    2023-12-10 | 54_nnrc_zephyr | 0.8560 | 0.6170 | 0.8924 | 0.8603 | 0.8546 | 0.9020 | 0.6083 | 0.8142 | 0.7488 | 0.8386
    2020-08-16 | Alibaba DAMO NLP | 0.8506 | 0.6650 | 0.8809 | 0.8552 | 0.8733 | 0.8397 | 0.6758 | 0.7691 | 0.5492 | 0.7526
    2020-05-16 | PingAn-OneConnect-Gammalab-DQA | 0.8484 | 0.6059 | 0.9021 | 0.8463 | 0.8730 | 0.8337 | 0.5812 | 0.7692 | 0.5172 | 0.7289
    2024-05-01 | PaliGemma-3B (finetune, 896px) | 0.8477 | 0.6543 | 0.9252 | 0.8326 | 0.8733 | 0.8099 | 0.7382 | 0.8314 | 0.7931 | 0.7571
    2024-01-21 | Spatial LLM v1.2 | 0.8443 | 0.6300 | 0.8917 | 0.8180 | 0.8644 | 0.8877 | 0.6106 | 0.7390 | 0.6897 | 0.8097
    2023-02-21 | LayoutLMv2_star_seg_large | 0.8430 | 0.7008 | 0.8737 | 0.8389 | 0.8536 | 0.8498 | 0.6872 | 0.7823 | 0.6181 | 0.8252
    2025-04-30 | Vlm(qwen) | 0.8411 | 0.7205 | 0.9398 | 0.8492 | 0.8425 | 0.7352 | 0.7833 | 0.8945 | 0.8276 | 0.8009
    2024-06-26 | MoVA-8B (generalist) | 0.8341 | 0.7639 | 0.8494 | 0.8131 | 0.8752 | 0.8187 | 0.6503 | 0.7048 | 0.5172 | 0.7901
    2023-06-30 | LATIN-Prompt + Claude (Zero shot) | 0.8336 | 0.6601 | 0.8553 | 0.8584 | 0.8169 | 0.8726 | 0.6021 | 0.6774 | 0.7126 | 0.8258
    2024-10-09 | llama3-qwenvit | 0.8318 | 0.7377 | 0.8928 | 0.7806 | 0.8822 | 0.8009 | 0.7732 | 0.7934 | 0.5862 | 0.7531
    2024-09-13 | gemma+ocr | 0.8282 | 0.6326 | 0.8395 | 0.8298 | 0.8346 | 0.8778 | 0.6417 | 0.6176 | 0.6410 | 0.8113
    2025-07-12 | DIVE-Doc (FRD) | 0.8267 | 0.5933 | 0.9088 | 0.8206 | 0.8500 | 0.7883 | 0.4996 | 0.7758 | 0.8621 | 0.7278
    2023-11-27 | 36_nnrc_llama2 | 0.8239 | 0.5404 | 0.8787 | 0.7958 | 0.8475 | 0.8813 | 0.5995 | 0.7991 | 0.6897 | 0.7922
    2025-09-25 | Qwen2.5-VL_DocVQA_2409 | 0.8230 | 0.6878 | 0.9300 | 0.8402 | 0.8426 | 0.6544 | 0.5369 | 0.8383 | 0.6519 | 0.7541
    2024-01-11 | nnrc_udop_224_6ds | 0.8227 | 0.5909 | 0.8706 | 0.8352 | 0.8335 | 0.8086 | 0.5972 | 0.6835 | 0.5862 | 0.7472
    2024-08-02 | loixc-onestage | 0.8221 | 0.6215 | 0.8463 | 0.8020 | 0.8593 | 0.8407 | 0.5835 | 0.6453 | 0.7241 | 0.7352
    2024-07-26 | loixc-vqa | 0.8127 | 0.6182 | 0.8188 | 0.7878 | 0.8560 | 0.8496 | 0.5840 | 0.5984 | 0.3993 | 0.7445
    2025-04-30 | Vis(qwen) | 0.8093 | 0.7030 | 0.9266 | 0.8286 | 0.7925 | 0.6996 | 0.8201 | 0.8781 | 0.7931 | 0.7816
    2023-05-06 | Docugami-Layout | 0.8031 | 0.5176 | 0.8875 | 0.7902 | 0.8214 | 0.8026 | 0.5089 | 0.7753 | 0.4224 | 0.7022
    2024-03-01 | Vary | 0.7916 | 0.7415 | 0.7949 | 0.7378 | 0.8475 | 0.8101 | 0.6671 | 0.6552 | 0.7471 | 0.7888
    2025-01-10 | llama | 0.7902 | 0.6923 | 0.8167 | 0.7716 | 0.8230 | 0.7710 | 0.6595 | 0.7343 | 0.5402 | 0.7287
    2022-01-07 | LayoutLMV2-large on Textract | 0.7873 | 0.4924 | 0.8771 | 0.8218 | 0.7726 | 0.7661 | 0.4820 | 0.7276 | 0.3793 | 0.6983
    2023-01-29 | LayoutLMv2_star_seg | 0.7859 | 0.5328 | 0.8406 | 0.7859 | 0.8128 | 0.7909 | 0.4879 | 0.6468 | 0.3644 | 0.6953
    2024-05-21 | PaliGemma-3B (finetune, 448px) | 0.7802 | 0.6290 | 0.8553 | 0.7235 | 0.8336 | 0.7410 | 0.6787 | 0.7694 | 0.8276 | 0.7123
    2023-05-25 | YoBerDaV2 Single-page | 0.7749 | 0.4737 | 0.8894 | 0.7586 | 0.7962 | 0.7398 | 0.4763 | 0.7173 | 0.7586 | 0.6976
    2020-05-14 | Structural LM-v2 | 0.7674 | 0.4931 | 0.8381 | 0.7621 | 0.7924 | 0.7596 | 0.4756 | 0.6282 | 0.5517 | 0.6549
    2024-10-09 | llama3-intern6b | 0.7670 | 0.6419 | 0.8360 | 0.7036 | 0.8455 | 0.7184 | 0.6323 | 0.7203 | 0.5862 | 0.7121
    2022-09-18 | pix2struct-large | 0.7656 | 0.4424 | 0.8827 | 0.7702 | 0.7774 | 0.7085 | 0.5383 | 0.6320 | 0.7586 | 0.6536
    2022-12-28 | Submission_ErnieLayout_base_finetuned_on_DocVQA_en_train_dev_textract_word_segments_ck-14000 | 0.7599 | 0.4313 | 0.8678 | 0.7726 | 0.7641 | 0.7330 | 0.4598 | 0.6957 | 0.4828 | 0.6097
    2024-09-18 | Gemma 2b + OCR | 0.7517 | 0.4797 | 0.8067 | 0.7147 | 0.7771 | 0.8311 | 0.4922 | 0.5978 | 0.4282 | 0.6948
    2024-04-22 | DOLMA_multifinetuning | 0.7458 | 0.4964 | 0.8335 | 0.7234 | 0.7832 | 0.7044 | 0.4135 | 0.5815 | 0.5172 | 0.6593
    2024-02-13 | instructblip | 0.7429 | 0.5158 | 0.7918 | 0.7019 | 0.7751 | 0.8088 | 0.5765 | 0.5892 | 0.5172 | 0.7062
    2025-01-22 | Ivy-VL | 0.7417 | 0.5853 | 0.7874 | 0.6919 | 0.7856 | 0.7674 | 0.5753 | 0.6341 | 0.5243 | 0.7174
    2025-01-22 | Ivy-VL-01 | 0.7417 | 0.5853 | 0.7874 | 0.6919 | 0.7856 | 0.7674 | 0.5753 | 0.6341 | 0.5243 | 0.7174
    2020-05-15 | QA_Base_MRC_2 | 0.7415 | 0.4854 | 0.8015 | 0.6738 | 0.7943 | 0.8136 | 0.5740 | 0.5831 | 0.5287 | 0.7161
    2024-07-31 | tixc-vqa | 0.7413 | 0.5732 | 0.7581 | 0.6967 | 0.7965 | 0.7738 | 0.4705 | 0.5396 | 0.5862 | 0.6927
    2020-05-15 | QA_Base_MRC_1 | 0.7407 | 0.4890 | 0.7984 | 0.6675 | 0.7936 | 0.8131 | 0.5854 | 0.6099 | 0.4943 | 0.7384
    2020-05-15 | QA_Base_MRC_4 | 0.7348 | 0.4735 | 0.8040 | 0.6647 | 0.7838 | 0.8043 | 0.5618 | 0.5810 | 0.4598 | 0.7332
    2020-05-15 | QA_Base_MRC_3 | 0.7322 | 0.4852 | 0.7958 | 0.6562 | 0.7842 | 0.8044 | 0.5679 | 0.5730 | 0.4511 | 0.7171
    2024-10-26 | 0713ap +gpt4o(no v) | 0.7309 | 0.5116 | 0.8018 | 0.7379 | 0.7305 | 0.7303 | 0.4696 | 0.6240 | 0.4598 | 0.7309
    2024-01-22 | VisFocus-Base | 0.7285 | 0.3822 | 0.8695 | 0.7234 | 0.7508 | 0.6717 | 0.3656 | 0.6748 | 0.6897 | 0.5507
    2020-05-15 | QA_Base_MRC_5 | 0.7274 | 0.4858 | 0.7877 | 0.6550 | 0.7754 | 0.8047 | 0.5405 | 0.5619 | 0.4598 | 0.7084
    2024-05-22 | Dolma multifinetuning 7 | 0.7219 | 0.4532 | 0.8259 | 0.7036 | 0.7585 | 0.6677 | 0.4227 | 0.5740 | 0.5862 | 0.6452
    2022-09-18 | pix2struct-base | 0.7213 | 0.4111 | 0.8386 | 0.7253 | 0.7503 | 0.6407 | 0.4211 | 0.5753 | 0.6552 | 0.5822
    2024-10-26 | 1010ap +gpt4o(no v) | 0.7201 | 0.4800 | 0.7680 | 0.7335 | 0.7258 | 0.7333 | 0.4797 | 0.5767 | 0.6552 | 0.7111
    2024-04-02 | MiniCPM-V-2 | 0.7187 | 0.6012 | 0.8062 | 0.6312 | 0.7880 | 0.6753 | 0.6834 | 0.6789 | 0.7586 | 0.6464
    2023-01-27 | LayoutLM-base+GNN | 0.6984 | 0.4747 | 0.7973 | 0.6848 | 0.7322 | 0.6323 | 0.4398 | 0.5599 | 0.5431 | 0.5388
    2021-12-05 | Electra Large Squad | 0.6961 | 0.4485 | 0.7703 | 0.6348 | 0.7364 | 0.7644 | 0.4594 | 0.5438 | 0.5172 | 0.6470
    2023-05-25 | YoBerDaV1 Multi-page | 0.6904 | 0.3481 | 0.8335 | 0.6411 | 0.7253 | 0.6854 | 0.4191 | 0.6299 | 0.5517 | 0.6129
    2020-05-16 | HyperDQA_V4 | 0.6893 | 0.3874 | 0.7792 | 0.6309 | 0.7478 | 0.7187 | 0.4867 | 0.5630 | 0.4138 | 0.5685
    2020-05-16 | HyperDQA_V3 | 0.6769 | 0.3876 | 0.7774 | 0.6167 | 0.7332 | 0.6961 | 0.4296 | 0.5373 | 0.4138 | 0.5650
    2023-07-06 | GPT3.5 | 0.6759 | 0.4741 | 0.7144 | 0.6524 | 0.7036 | 0.6858 | 0.5385 | 0.5038 | 0.5954 | 0.6660
    2020-05-16 | HyperDQA_V2 | 0.6734 | 0.3818 | 0.7666 | 0.6110 | 0.7332 | 0.6867 | 0.4834 | 0.5560 | 0.3793 | 0.5902
    2020-05-09 | HyperDQA_V1 | 0.6717 | 0.4013 | 0.7693 | 0.6197 | 0.7167 | 0.6922 | 0.3598 | 0.5596 | 0.4138 | 0.5504
    2023-08-15 | LATIN-Tuning-Prompt + Alpaca (Zero-shot) | 0.6687 | 0.3732 | 0.7529 | 0.6545 | 0.6615 | 0.7463 | 0.5439 | 0.4941 | 0.3481 | 0.6831
    2023-07-14 | donut_base | 0.6590 | 0.3960 | 0.8407 | 0.6604 | 0.6987 | 0.4630 | 0.2969 | 0.6964 | 0.0345 | 0.5057
    2023-12-04 | ViTLP | 0.6588 | 0.3880 | 0.8220 | 0.6705 | 0.6962 | 0.4670 | 0.2973 | 0.6307 | 0.4483 | 0.4910
    2023-12-21 | DocVQA: A Dataset for VQA on Document Images | 0.6566 | 0.3569 | 0.7645 | 0.5775 | 0.7000 | 0.7205 | 0.4220 | 0.4802 | 0.4483 | 0.6108
    2022-09-22 | BROS_BASE (WebViCoB 6.4M) | 0.6563 | 0.3780 | 0.7757 | 0.6681 | 0.6557 | 0.6175 | 0.3497 | 0.5782 | 0.4224 | 0.5754
    2023-09-24 | Layoutlm_DocVQA+Token_v2 | 0.6562 | 0.3935 | 0.7764 | 0.6228 | 0.6737 | 0.6711 | 0.3385 | 0.5109 | 0.5086 | 0.5515
    2023-07-21 | donut_half_input_imageSize | 0.6536 | 0.3930 | 0.8366 | 0.6548 | 0.6950 | 0.4609 | 0.2486 | 0.6940 | 0.0345 | 0.4941
    2021-12-04 | Bert Large | 0.6447 | 0.3502 | 0.7535 | 0.5488 | 0.6920 | 0.7266 | 0.4171 | 0.5254 | 0.5517 | 0.6076
    2022-05-23 | Dessurt | 0.6322 | 0.3164 | 0.8058 | 0.6486 | 0.6520 | 0.4852 | 0.2862 | 0.5830 | 0.3793 | 0.4365
    2024-01-09 | dolma | 0.6196 | 0.4003 | 0.7642 | 0.5805 | 0.6609 | 0.5247 | 0.3958 | 0.5596 | 0.5690 | 0.4972
    2025-04-30 | Vlm(llama) | 0.5914 | 0.4167 | 0.7576 | 0.5024 | 0.6569 | 0.4938 | 0.3839 | 0.6258 | 0.5632 | 0.5198
    2020-05-09 | bert fulldata fintuned | 0.5900 | 0.4169 | 0.6870 | 0.4269 | 0.6710 | 0.7315 | 0.5124 | 0.4900 | 0.4483 | 0.5907
    2020-05-01 | bert finetuned | 0.5872 | 0.2986 | 0.7011 | 0.4849 | 0.6359 | 0.6933 | 0.4622 | 0.4751 | 0.4483 | 0.4895
    2020-04-30 | HyperDQA_V0 | 0.5715 | 0.3131 | 0.6780 | 0.4732 | 0.6630 | 0.5716 | 0.3623 | 0.4351 | 0.3793 | 0.4941
    2023-09-26 | LayoutLM_Docvqa+Token_v0 | 0.4980 | 0.2319 | 0.6035 | 0.4320 | 0.5684 | 0.4779 | 0.2768 | 0.3081 | 0.1293 | 0.4178
    2022-04-27 | LayoutLMv2, Tesseract OCR eval (dataset OCR trained) | 0.4961 | 0.2544 | 0.5523 | 0.4177 | 0.5495 | 0.5914 | 0.2888 | 0.1361 | 0.2069 | 0.4187
    2025-04-30 | Vis(llama) | 0.4919 | 0.2820 | 0.6123 | 0.4556 | 0.5167 | 0.4446 | 0.2262 | 0.5262 | 0.5287 | 0.4508
    2022-03-29 | LayoutLMv2, Tesseract OCR eval (Tesseract OCR trained) | 0.4815 | 0.2253 | 0.5440 | 0.4216 | 0.5207 | 0.5709 | 0.2430 | 0.1353 | 0.3103 | 0.3859
    2023-07-26 | donut_large_encoderSize_finetuned_20_epoch | 0.4673 | 0.2236 | 0.6691 | 0.4581 | 0.5026 | 0.2665 | 0.1356 | 0.4983 | 0.5734 | 0.3430
    2020-04-27 | bert | 0.4557 | 0.2233 | 0.5259 | 0.2633 | 0.5113 | 0.7775 | 0.4859 | 0.3565 | 0.0345 | 0.5778
    2020-05-16 | UGLIFT v0.1 (Clova OCR) | 0.4417 | 0.1766 | 0.5600 | 0.3178 | 0.5340 | 0.4520 | 0.2253 | 0.3573 | 0.4483 | 0.3356
    2024-05-21 | PaliGemma-3B (finetune, 224px) | 0.4374 | 0.4025 | 0.4516 | 0.3236 | 0.5574 | 0.4055 | 0.3500 | 0.4077 | 0.6379 | 0.4066
    2025-04-30 | HocrEN(Technique 2) - qwen7b | 0.4282 | 0.1242 | 0.3501 | 0.4594 | 0.4755 | 0.4479 | 0.2327 | 0.0400 | 0.2414 | 0.3793
    2025-04-30 | HocrEN(Technique 2) - qwen14b | 0.3794 | 0.0859 | 0.3120 | 0.4120 | 0.4467 | 0.3449 | 0.1065 | 0.0450 | 0.1724 | 0.3581
    2022-10-21 | Finetuning LayoutLMv3_Base | 0.3596 | 0.2102 | 0.4498 | 0.3858 | 0.3262 | 0.3496 | 0.1552 | 0.3404 | 0.0345 | 0.2706
    2023-09-19 | testtest | 0.3569 | 0.3018 | 0.3407 | 0.2748 | 0.4693 | 0.3186 | 0.2682 | 0.2753 | 0.6207 | 0.3356
    2020-05-14 | Plain BERT QA | 0.3524 | 0.1687 | 0.4489 | 0.2029 | 0.4321 | 0.4812 | 0.3517 | 0.3096 | 0.0345 | 0.3747
    2020-05-16 | Clova OCR V0 | 0.3489 | 0.0977 | 0.4855 | 0.2670 | 0.3811 | 0.3958 | 0.2489 | 0.2875 | 0.0345 | 0.3062
    2020-05-01 | HDNet | 0.3401 | 0.2040 | 0.4688 | 0.2181 | 0.4710 | 0.1916 | 0.2488 | 0.2736 | 0.1379 | 0.2458
    2020-05-16 | CLOVA OCR | 0.3296 | 0.1246 | 0.4612 | 0.2455 | 0.3622 | 0.3746 | 0.1692 | 0.2736 | 0.0690 | 0.3205
    2023-07-21 | donut_small_encoderSize_finetuned_20_epoch | 0.3157 | 0.1935 | 0.4417 | 0.2912 | 0.3400 | 0.2075 | 0.1495 | 0.2658 | 0.3103 | 0.2644
    2020-04-29 | docVQAQV_V0.1 | 0.3016 | 0.2010 | 0.3898 | 0.3810 | 0.2933 | 0.0664 | 0.1842 | 0.2736 | 0.1586 | 0.1695
    2025-04-30 | HocrEN(Technique 2) - qwen32b | 0.2931 | 0.0599 | 0.2225 | 0.2960 | 0.3636 | 0.2765 | 0.0605 | 0.0264 | 0.1379 | 0.2948
    2025-05-10 | m-rope2 | 0.2676 | 0.2480 | 0.2885 | 0.2493 | 0.2836 | 0.2676 | 0.2282 | 0.2541 | 0.3793 | 0.2452
    2025-04-30 | HocrEN(Technique 2) - llama | 0.2488 | 0.0747 | 0.2209 | 0.2086 | 0.3077 | 0.2775 | 0.1442 | 0.0516 | 0.2414 | 0.2046
    2020-04-26 | docVQAQV_V0 | 0.2342 | 0.1646 | 0.3133 | 0.2623 | 0.2483 | 0.0549 | 0.2277 | 0.1856 | 0.1034 | 0.1635
    2025-04-30 | HocrEN(Technique 2) - mistral | 0.1830 | 0.0340 | 0.2019 | 0.1499 | 0.2319 | 0.1716 | 0.0611 | 0.0219 | 0.0000 | 0.1638
    2025-05-15 | gmini25 | 0.1714 | 0.1298 | 0.1693 | 0.1754 | 0.1808 | 0.1604 | 0.1736 | 0.2353 | 0.2069 | 0.1458
    2025-05-15 | doubao15 | 0.1585 | 0.1141 | 0.1619 | 0.1599 | 0.1676 | 0.1460 | 0.1741 | 0.2194 | 0.2414 | 0.1430
    2025-05-15 | claude37 | 0.1584 | 0.1153 | 0.1600 | 0.1594 | 0.1691 | 0.1480 | 0.1578 | 0.1868 | 0.2414 | 0.1458
    2025-05-15 | gpt4o | 0.1541 | 0.1002 | 0.1563 | 0.1515 | 0.1685 | 0.1447 | 0.1811 | 0.1949 | 0.1724 | 0.1353
    2025-05-15 | wenxin45 | 0.1477 | 0.1048 | 0.1494 | 0.1454 | 0.1655 | 0.1280 | 0.1663 | 0.1966 | 0.1034 | 0.1093
    2021-02-08 | seq2seq | 0.1081 | 0.0758 | 0.1283 | 0.0829 | 0.1332 | 0.0822 | 0.0786 | 0.0779 | 0.4828 | 0.1052
    2024-01-23 | lixiang-vlm-7b-handled | 0.0990 | 0.0478 | 0.0798 | 0.0348 | 0.1648 | 0.0863 | 0.1309 | 0.1395 | 0.5517 | 0.1191
    2024-01-24 | lixiang-vlm-7b | 0.0631 | 0.0313 | 0.0693 | 0.0272 | 0.0894 | 0.0639 | 0.0122 | 0.1145 | 0.5517 | 0.0826
    2025-09-15 | sg | 0.0603 | 0.0895 | 0.0398 | 0.0751 | 0.0444 | 0.0740 | 0.0756 | 0.0749 | 0.4138 | 0.0811
    2025-09-15 | dfnb | 0.0595 | 0.0864 | 0.0364 | 0.0726 | 0.0441 | 0.0746 | 0.0953 | 0.0739 | 0.4138 | 0.0933
    2025-09-15 | clipb | 0.0588 | 0.0755 | 0.0386 | 0.0738 | 0.0463 | 0.0688 | 0.0770 | 0.0770 | 0.4483 | 0.0695
    2025-09-15 | dfnl | 0.0585 | 0.0852 | 0.0380 | 0.0737 | 0.0430 | 0.0728 | 0.0878 | 0.0771 | 0.3793 | 0.0712
    2024-01-21 | lixiang-vlm handled | 0.0536 | 0.0243 | 0.0272 | 0.0097 | 0.1084 | 0.0400 | 0.0605 | 0.0395 | 0.1034 | 0.0568
    2024-01-21 | lixiang-vlm | 0.0264 | 0.0176 | 0.0123 | 0.0045 | 0.0502 | 0.0262 | 0.0078 | 0.0291 | 0.1034 | 0.0273
    2020-06-16 | Test Submission | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
    2024-09-11 | zs | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
    2025-06-25 | table-r1 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
