[Personal code] Converting references copied from a PDF into a standard format (Python)



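Sometimes a submission system asks for a paper's reference list. If the paper was written in LaTeX, the references are hard to copy out of the PDF in a usable format. So I wrote a small program that tidies them up according to a simple rule. It is fairly basic and the code is not optimized, so experts, please go easy on me.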

First, copy the references from the PDF into a txt file. The line breaks end up in the wrong places:

[84] DENG J, DONG W, SOCHER R, et al. Imagenet: A large-scale hierarchical image database[C]// 2009
IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2009, Miami,
Florida, USA, June 20-25. 2009: 248–255.
[85] HINTON G E, SRIVASTAVA N, KRIZHEVSKY A, et al. Improving neural networks by preventing
co-adaptation of feature detectors[J]. CoRR, 2012, abs/1207.0580.
[86] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]// Computer
Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12. 2014: 818–
833.
[87] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on
Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12. 2015: 1–9.
[88] IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal
covariate shift[C]// Proceedings of the 32nd International Conference on Machine Learning, ICML
2015, Lille, France, July 6-11. vol 37. 2015: 448–456.
[89] ZHANG W, TANG P, ZHAO L. Remote sensing image scene classification using CNN-CapsNet[J].
Remote Sensing, 2019, 11(5):494.

Save the text above as 'ref.txt', then delete all the line breaks so it can be processed further:

# Read ref.txt and delete every line break
with open("ref.txt", "r", encoding="utf-8") as src:
    text = src.read().replace('\n', '')
# Write the single-line text to test.txt ('w' overwrites, so reruns do not append duplicates)
with open("test.txt", 'w', encoding="utf-8") as dst:
    dst.write(text)
# Read the single-line document back in
with open('test.txt', 'r', encoding='utf-8') as f:
    data = f.readlines()

This produces the following 'test.txt':

[84] DENG J, DONG W, SOCHER R, et al. Imagenet: A large-scale hierarchical image database[C]// 2009IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2009, Miami,Florida, USA, June 20-25. 2009: 248–255.[85] HINTON G E, SRIVASTAVA N, KRIZHEVSKY A, et al. Improving neural networks by preventingco-adaptation of feature detectors[J]. CoRR, 2012, abs/1207.0580.[86] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]// ComputerVision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12. 2014: 818–833.[87] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// 2015 IEEE Conference onComputer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12. 2015: 1–9.[88] IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internalcovariate shift[C]// Proceedings of the 32nd International Conference on Machine Learning, ICML2015, Lille, France, July 6-11. vol 37. 2015: 448–456.[89] ZHANG W, TANG P, ZHAO L. Remote sensing image scene classification using CNN-CapsNet[J].Remote Sensing, 2019, 11(5):494.
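The 'test.txt' output above shows a side effect of deleting line breaks outright: words that were wrapped across lines get fused, as in "2009IEEE" and "preventingco-adaptation". Below is a minimal sketch of one way around this. It assumes a wrapped line should be rejoined with a single space, unless the line ends in a hyphen or en dash (as in the page range "818–" / "833."), where the pieces are joined directly. `join_wrapped_lines` is a hypothetical helper name, not part of the original script.

```python
def join_wrapped_lines(lines):
    """Join hard-wrapped lines into one string.

    Assumption: a line ending in '-' or '–' continues directly on the
    next line (e.g. a page range), so no space is inserted there.
    Every other line break is replaced by a single space.
    """
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Insert a space unless the previous line ended with a dash
        if parts and not parts[-1].endswith(("-", "–")):
            parts.append(" ")
        parts.append(line)
    return "".join(parts)


# Reference [86] as it appears after copying from the PDF
sample = [
    "[86] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]// Computer\n",
    "Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12. 2014: 818–\n",
    "833.\n",
]
print(join_wrapped_lines(sample))
```

Under this rule a word that was genuinely hyphenated at a line end keeps its hyphen, so the result may still need a quick manual check.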

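Next, split the text above back into one reference per line, using the rule that each entry starts with '[' followed by a digit: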

# data is a list holding a single line; take that line as a str
data = data[0]

# Scan the characters and look for the pattern '[' followed by a digit
temp = []
s = ''
# Start at 1 to skip the very first '['; run to the end so the final character is kept
for index in range(1, len(data)):
    # A '[' followed by a digit marks the start of the next reference
    if data[index] == '[' and data[index+1:index+2].isdigit():
        s += '\n'
        temp.append(s)
        s = ''
    else:
        s += data[index]
# Save the last reference
temp.append(s)

# 'w' overwrites the file, so rerunning the script does not append duplicates
with open("new_test.txt", 'w', encoding="utf-8") as f:
    for item in temp:
        # Restore the leading '[' that the loop skipped
        f.write('[' + item)

The final result is:

[84] DENG J, DONG W, SOCHER R, et al. Imagenet: A large-scale hierarchical image database[C]// 2009IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2009, Miami,Florida, USA, June 20-25. 2009: 248–255.
[85] HINTON G E, SRIVASTAVA N, KRIZHEVSKY A, et al. Improving neural networks by preventingco-adaptation of feature detectors[J]. CoRR, 2012, abs/1207.0580.
[86] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]// ComputerVision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12. 2014: 818–833.
[87] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// 2015 IEEE Conference onComputer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12. 2015: 1–9.
[88] IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internalcovariate shift[C]// Proceedings of the 32nd International Conference on Machine Learning, ICML2015, Lille, France, July 6-11. vol 37. 2015: 448–456.
[89] ZHANG W, TANG P, ZHAO L. Remote sensing image scene classification using CNN-CapsNet[J].Remote Sensing, 2019, 11(5):494.
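The character-by-character scan can also be expressed with a regular expression. The following is a sketch of that alternative, not the original method. `split_references` and `ENTRY_START` are names introduced here for illustration, and the code assumes that a bracketed number such as `[85]` only ever appears at the start of an entry (markers like `[C]` and `[J]` contain no digits, so they are not matched).

```python
import re

# Marker that opens each entry: '[' + one or more digits + ']'
ENTRY_START = re.compile(r"\[\d+\]")


def split_references(text):
    """Split one long string into a list of entries, one per '[n]' marker."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    if not starts:
        return []
    # Each entry runs from its own marker to the next marker (or the end)
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


# Two entries from the list above, already joined into a single line
sample = (
    "[85] HINTON G E, SRIVASTAVA N, KRIZHEVSKY A, et al. Improving neural networks "
    "by preventing co-adaptation of feature detectors[J]. CoRR, 2012, abs/1207.0580."
    "[89] ZHANG W, TANG P, ZHAO L. Remote sensing image scene classification "
    "using CNN-CapsNet[J]. Remote Sensing, 2019, 11(5):494."
)
for ref in split_references(sample):
    print(ref)
```

Slicing between marker positions keeps every character of every entry, including the final period, and any text before the first marker is simply ignored.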



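Copyright notice: This is an original article by weixin_38757163, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.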