Collection of Terminology Databases and Corpora

2022/11/21 20:59
In translation practice, we often run into words and expressions that no dictionary covers. Terminology databases and corpora can help in these cases.


Online Terminology Databases

China Keywords:

http://www.china.org.cn/chinese/china_key_words/

Standardized terminology database for translating discourse with Chinese characteristics:

http://210.72.20.108/index/index.jsp

China core vocabulary:

https://www.cnkeywords.net/index

Key Concepts in Chinese Thought and Culture:

https://www.chinesethought.cn/TermBase.aspx

United Nations terminology database (UNTERM):

https://unterm.un.org/UNTERM/portal/welcome

TermOnline:

https://www.termonline.cn/index

National Academy for Educational Research (NAER) terminology database:

https://terms.naer.edu.tw/download/

Chinese-English Dictionary of Ming Government Official Titles:

https://escholarship.org/uc/uci_libs

China standardized terminology (CNKI):

https://shuyu.cnki.net/#/

Grand Dictionnaire Terminologique:

https://gdt.oqlf.gouv.qc.ca/

TERMIUM:

https://www.btb.termiumplus.gc.ca/tpv2alpha/alpha-eng.html?lang=eng

LingoSail TermBox:

http://termbox.lingosail.com/

Microsoft terminology database:

https://www.microsoft.com/zh-cn/language

World Health Organization (WHO) terminology database:

https://www.who.int/home/cms-decommissioning

Electronic engineering glossary:

https://www.maximintegrated.com/cn/glossary/definitions.mvp/terms/all

FreeMdict, 100 GB of offline dictionaries for download:

https://downloads.freemdict.com/

OneDict (patent terminology database):

http://www.onedict.com/

National standard "Logistics Terms":

https://logistics.nankai.edu.cn/_upload/article/76/83/1c5da71e4b8e9838ae0843c8cb3d/3a1617ed-acfb-4504-9e18-c079e98e6154.pdf

Winter Olympics terminology lookup site:

owgt.lingosail.com/

Music terminology lookup:

http://dictionary.t-classical.com/

European Union Language and terminology:

https://eur-lex.europa.eu/summary/glossary.html?locale=en

IATE (Interactive Terminology for Europe), the EU's terminology database:

https://iate.europa.eu/home

Hong Kong English-Chinese glossary of legal terms:

https://www.elegislation.gov.hk/glossary/chi

Magic Search:

https://magicsearch.org/

Linguee:

https://www.linguee.com/

The Free Dictionary:

https://www.thefreedictionary.com/

Glosbe:

https://glosbe.com/

Online Corpora

Chinese

BCC Corpus:

http://bcc.blcu.edu.cn/

Corpus Online:

http://www.aihanyu.org/cncorpus/index.aspx

Center for Chinese Linguistics, Peking University (CCL):

ccl.pku.edu.cn

BFSU Corpus Linguistics (Beijing Foreign Studies University):

bfsu-corpus.org/

Balanced Corpus of Modern Chinese (Sinica Corpus):

https://www.sinica.edu.tw/SinicaCorpus/

Ancient Chinese corpus / tagged corpus of early Mandarin Chinese / electronic texts of Chinese classics:

https://www.sinica.edu.tw/ch

Treebank database:

http://treebank.sinica.edu.tw/

Sou Wen Jie Zi (Chinese character and word lookup):

http://words.sinica.edu.tw/

Media Language Corpus (MLC):

https://ling.cuc.edu.cn/RawPub/

Shared corpus resources from the Information Retrieval Lab, Harbin Institute of Technology (HIT):

http://ir.hit.edu.cn/demo/ltp/Sharing_Plan.htm

Synchronous corpus of Chinese across pan-Chinese regions (LiVaC):

http://www.livac.org/index.php?lang=sc

Chinese Linguistic Data Consortium (Chinese LDC):

http://www.chineseldc.org/

Academia Sinica Tagged Corpus of Early Mandarin Chinese:

http://lingcorpus.iis.sinica.edu.tw/early/

Chinese-English parallel corpus of Hongloumeng (Dream of the Red Chamber):

http://corpus.usx.edu.cn/hongloumeng/images/shiyongshuoming.htm

International

BNC (British National Corpus):

http://www.natcorp.ox.ac.uk/

BOE (the Bank of English, the Collins English corpus):

http://www.collinslanguage.com/language-resources/dictionary-datasets/

ANC (American National Corpus):

https://www.anc.org/

Lancaster Corpus of Mandarin Chinese (LCMC):

http://ota.oucs.ox.ac.uk/scripts/download.php?otaid=2474

Sketch Engine multilingual corpora:

https://www.sketchengine.eu/

BASE (British Academic Spoken English Corpus):

https://warwick.ac.uk/fac/soc/al-archive-deleted/research/base

Lextutor:

http://www.lextutor.ca/

MyMemory:

https://mymemory.translated.net/

TAUS:

https://datamarketplace.taus.net/

TTMEM:

https://www.ttmem.com/terminology/download-translation-memory/

TinyTM:

http://tinytm.sourceforge.net/

DGT Translation Memory:

https://magmatranslation.com/en/free-translation-memory/

European Parliament Proceedings Parallel Corpus 1996-2011:

https://statmt.org/europarl/

University of Maryland Parallel Corpus Project: The Bible:

http://users.umiacs.umd.edu/~resnik/parallel/bible.html

Aligned Hansards of the 36th Parliament of Canada:

https://www.isi.edu/research_groups/nlg/home

Publications Office of the EU:

https://op.europa.eu/en/web/general-publications/publications

Wikimedia Downloads:

https://dumps.wikimedia.org/backup-index.html

United Nations Parallel Corpus:

https://conferences.unite.un.org/UNCorpus/

European language pairs:

https://www.statmt.org/wmt13/translation-task.html#download

parallel corpus search:

http://paralela.clarin-pl.eu/

UM-Corpus: A Large English-Chinese Parallel Corpus (Natural Language Processing & Portuguese-Chinese Machine Translation Laboratory):

http://nlp2ct.cis.umac.mo/um-corpus/um-corpus-license.html

Clarin Parallel corpora:

https://www.clarin.eu/resource-families/parallel-corpora

The PKU 863 Chinese-English Parallel Corpus:

https://www.lancaster.ac.uk/fass/projects/corpus/863parallel/

BYU corpora:

https://corpus.byu.edu/


Other Subcorpora


A collection of translated literature:

https://opus.nlpl.eu/Books.php

A collection of EU Translation Memories provided by the JRC:

https://opus.nlpl.eu/DGT.php

Documents from the Catalan Government:

https://opus.nlpl.eu/DOGC.php

European Central Bank corpus:

https://opus.nlpl.eu/ECB.php

European Medicines Agency documents:

https://opus.nlpl.eu/EMEA.php

The EU bookshop corpus:

https://opus.nlpl.eu/EUbookshop.php

The European constitution/European Parliament Proceedings:

https://opus.nlpl.eu/EUconst.php

French-English Giga-Word Corpus:

https://opus.nlpl.eu/giga-fren.php

GNOME localization files:

https://opus.nlpl.eu/GNOME.php

News stories in various languages:

https://opus.nlpl.eu/GlobalVoices.php

hrenWaC – Croatian-English WaC corpus:

https://opus.nlpl.eu/hrenWaC.php

JRC-Acquis – legislative EU texts:

https://opus.nlpl.eu/JRC-Acquis.php

KDE4 – KDE4 localization files (v.2):

https://opus.nlpl.eu/KDE4.php

KDEdoc – the KDE manual corpus:

https://opus.nlpl.eu/KDEdoc.php

MBS – Belgisch Staatsblad corpus:

https://opus.nlpl.eu/MBS.php

memat – Xhosa/English parallel data:

https://opus.nlpl.eu/memat.php

MontenegrinSubs – Montenegrin movie subtitles:

https://opus.nlpl.eu/MontenegrinSubs.php

MultiUN – Translated UN documents:

https://opus.nlpl.eu/MultiUN.php

News Commentary, v9.0, v9.1:

https://opus.nlpl.eu/News-Commentary-v11.php

OfisPublik – Breton – French parallel texts:

https://opus.nlpl.eu/OfisPublik.php

OO – the OpenOffice.org corpus:

https://opus.nlpl.eu/OpenOffice-v2.php

OpenOffice.org 3 corpus:

https://opus.nlpl.eu/OpenOffice-v3.php

OpenSubtitles – the opensubtitles.org corpus:

https://opus.nlpl.eu/OpenSubtitles-v1.php

OpenSubtitles2016 – snapshot from 2016:

https://opus.nlpl.eu/OpenSubtitles-v2016.php

OpenSubtitles2018 – new complete version:

http://opus.nlpl.eu/OpenSubtitles-v2018.php

ParaCrawl corpus:

https://opus.nlpl.eu/ParaCrawl.php

ParCor – A Parallel Pronoun-Coreference Corpus/PHP – the PHP manual corpus:

http://opus.nlpl.eu/ParCor

Regeringsförklaringen – a tiny example corpus:

http://opus.nlpl.eu/RF.php

SETIMES – A parallel corpus of the Balkan languages:

http://opus.nlpl.eu/SETIMES.php

SPC – Stockholm Parallel Corpora:

https://opus.nlpl.eu/SPC.php

Tatoeba – A DB of translated sentences:

http://opus.nlpl.eu/Tatoeba.php

TedTalks hr-en:

http://opus.nlpl.eu/TedTalks.php

TED Talks 2013:

http://opus.nlpl.eu/TED2013.php

Tanzil – A collection of Quran translations:

http://opus.nlpl.eu/Tanzil.php

TEP – The Tehran English-Persian subtitle corpus:

http://opus.nlpl.eu/TEP.php

Ubuntu – Ubuntu localization files:

http://opus.nlpl.eu/Ubuntu.php

UN – Translated UN documents:

http://opus.nlpl.eu/UN.php

Wikipedia – translated sentences from Wikipedia:

http://opus.nlpl.eu/Wikipedia.php

WikiSource – (small en-sv sample only):

http://opus.nlpl.eu/WikiSource.php

WMT News Test Sets:

http://opus.nlpl.eu/WMT-News.php

The Xhosa – English Navy corpus:

http://opus.nlpl.eu/XhosaNavy.php