NAACL 2021 Accepted Paper List

  • Post author:
  • Post category: Other


Official site: https://2021.naacl.org/program/accepted/



Paper List

The NAACL 2021 accepted papers were extracted with a crawler into the Excel file below.

Total: 528 papers



Get the Excel file: Link


Extraction code: 2021
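
The crawler itself is not shown in this post. As a rough sketch of the title-extraction step, the standard-library parser below collects the text of every element with a given tag. It assumes that each paper title on the accepted-papers page sits inside a <strong> element, which is a guess about the page markup and should be checked against the real HTML. TitleCollector and extract_titles are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the text inside every <tag> element (the tag name is an assumption)."""

    def __init__(self, tag="strong"):
        super().__init__()
        self.tag = tag
        self.depth = 0      # > 0 while the parser is inside a matching element
        self.buffer = []    # text pieces of the element currently being read
        self.titles = []    # finished titles, in page order

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Collapse line breaks and repeated spaces into single spaces
                title = " ".join("".join(self.buffer).split())
                if title:
                    self.titles.append(title)
                self.buffer = []


def extract_titles(html, tag="strong"):
    """Return one {"title": ...} row per matching element in the HTML string."""
    parser = TitleCollector(tag)
    parser.feed(html)
    parser.close()
    return [{"title": title} for title in parser.titles]
```

The rows returned by extract_titles could then be written out with pandas, for example pd.DataFrame(rows).to_excel("NAACL2021 Paper List.xlsx", index=False), which would give the "title" column that the search tool below reads.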



Quick search tool

Use Python to quickly search the paper list and fetch PDF links from arXiv.

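import pandas as pd
import requests
from lxml import etree


def get_pdf(key):
    # Search arXiv for the given title and print the title and PDF link of each hit.
    url = "https://arxiv.org/search/"
    # Passing the query through params lets requests URL-encode the title.
    params = {"query": key, "searchtype": "all", "abstracts": "show",
              "order": "-announced_date_first", "size": 50}
    rep = requests.get(url, params=params, timeout=30)
    body = etree.HTML(rep.content)
    # Text in this paragraph is treated as the "no results" message.
    no_result = body.xpath(r'//*[@id="main-container"]/div[2]/p[1]/text()')
    if no_result:
        print("Sorry, your query for all: {} produced no results.".format(key))
    else:
        for li in body.xpath(r'//*[@id="main-container"]/div[2]/ol/li'):
            # Normalize whitespace in the title instead of removing every space.
            title = " ".join(li.xpath(r'./p[1]/text()')[0].split())
            link = li.xpath(r'./div/p/span/a[1]/@href')[0]
            print("[PDF]:", title, link)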


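# Search the paper titles for each keyword in a list
def Search_domain_print(key_list, df, withPdf=False):
    keys = set(key.lower() for key in key_list)
    # Drop empty cells so that .lower() is only called on strings
    titles = df["title"].dropna().astype(str).tolist()
    for key in keys:
        count = 0
        for title in titles:
            if key in title.lower():
                count += 1
                print("[{}]-[{}]:{}".format(key, count, title))
                if withPdf:
                    get_pdf(title)
                print()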

if __name__ == '__main__':
    key_list = ["Text Classification",
    #             "Sentiment Analysis",
    #             "Knowledge Graph",
               ]
    # Set withPdf to True to also search arXiv and print the PDF link for each match, but this is very slow.
    # You can also call get_pdf("title") on its own to look up a single paper.
    excel = pd.read_excel('data/NAACL2021 Paper List.xlsx')
    Search_domain_print(key_list, excel, withPdf=False)

Result:

[Image: screenshot of the search output]
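
Search_domain_print only prints its matches. If the matches are needed for further processing, a variant that returns them can be handy. The sketch below works on a plain list of titles (for example df["title"].tolist()) and skips non-string cells; find_matches is a hypothetical helper, not part of the original script.

```python
def find_matches(keywords, titles):
    """Map each lowercased keyword to the titles that contain it (case-insensitive)."""
    matches = {}
    for keyword in keywords:
        key = keyword.lower()
        if key in matches:
            # The same keyword in a different case was already handled
            continue
        matches[key] = [
            title for title in titles
            if isinstance(title, str) and key in title.lower()
        ]
    return matches


if __name__ == "__main__":
    titles = ["Knowledge Guided Metric Learning for Few-Shot Text Classification"]
    for key, hits in find_matches(["Text Classification"], titles).items():
        for count, title in enumerate(hits, start=1):
            print("[{}]-[{}]:{}".format(key, count, title))
```

Returning a dictionary keeps the search separate from the printing, so the same result can be printed, counted, or passed on to get_pdf.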

Result with PDF lookup:

[Image: screenshot of the output with PDF links]
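
get_pdf scrapes the HTML of the arXiv search page, so it depends on that page's layout. As an alternative (not what the original script does), arXiv also offers a public query API that returns an Atom feed. The sketch below builds a title query and extracts the PDF link from each feed entry using only the standard library. build_query_url and parse_pdf_links are hypothetical names, and the default of 5 results is an arbitrary choice.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"          # Atom XML namespace
API_URL = "http://export.arxiv.org/api/query"   # arXiv API endpoint


def build_query_url(title, max_results=5):
    """Build an arXiv API URL that searches the title field for an exact phrase."""
    params = {
        "search_query": 'ti:"{}"'.format(title),  # ti: restricts the search to titles
        "max_results": max_results,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_pdf_links(atom_xml):
    """Return a (title, pdf_url) pair for every entry in an Atom feed.

    pdf_url is None when an entry has no link marked title="pdf".
    """
    root = ET.fromstring(atom_xml)
    results = []
    for entry in root.findall(ATOM + "entry"):
        title = " ".join(entry.findtext(ATOM + "title", default="").split())
        pdf_url = None
        for link in entry.findall(ATOM + "link"):
            if link.get("title") == "pdf":
                pdf_url = link.get("href")
        results.append((title, pdf_url))
    return results
```

To use it, download the feed for one title, for example with urllib.request.urlopen(build_query_url(title)).read(), and pass the returned bytes to parse_pdf_links. When looking up many titles in a row, it is worth pausing between requests.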



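Copyright notice: This is an original article by qq_35891520, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.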