
Posted: 2024-03-03



CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then besides NLTK (https://www.nltk.org/),
you will need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission
probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from
a training corpus from Universal Dependencies. Do not forget to involve the
start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
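The two counting steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the function name count_events, the toy corpus, and the representation of a sentence as a list of (word, tag) pairs are all hypothetical and are not taken from the provided files.

```python
from collections import Counter

START, END = "<s>", "</s>"  # sentence markers, written here as plain strings


def count_events(tagged_sentences):
    """Count tag bigrams (including sentence markers) and (tag, word) pairs."""
    transition_counts = Counter()
    emission_counts = Counter()
    for sentence in tagged_sentences:
        # Pad the tag sequence so that <s> -> first tag and last tag -> </s>
        # are counted as transitions too.
        tags = [START] + [tag for _, tag in sentence] + [END]
        transition_counts.update(zip(tags, tags[1:]))
        emission_counts.update((tag, word) for word, tag in sentence)
    return transition_counts, emission_counts


# Hypothetical two-sentence corpus, invented for illustration.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN")],
]
transition_counts, emission_counts = count_events(toy_corpus)
```

Keeping the markers inside the transition counts means that the probability of a tag starting or ending a sentence is estimated in the same way as any other transition.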
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
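To make the idea of reserving probability mass for unseen events concrete, here is a hand-rolled sketch of a Witten-Bell style estimate for a single conditional distribution (for example, the words emitted by one tag). It is not the NLTK implementation and not the content of smoothing.py; the function name witten_bell and the bins parameter (the assumed number of possible outcomes) are choices made for this illustration.

```python
from collections import Counter


def witten_bell(counts, bins):
    """Return a function mapping an outcome to its smoothed probability.

    counts: Counter of observed outcomes for one conditioning context.
    bins:   assumed total number of possible outcomes, seen or unseen.
    """
    total = sum(counts.values())                           # N: observed tokens
    seen_types = sum(1 for c in counts.values() if c > 0)  # T: observed types
    unseen_types = bins - seen_types                       # Z: unseen types
    if unseen_types <= 0:
        raise ValueError("bins must exceed the number of observed types")

    if total == 0:
        unseen_prob = 1.0 / unseen_types
    else:
        # Mass T / (N + T) is reserved and shared equally by the Z unseen types.
        unseen_prob = seen_types / (unseen_types * (total + seen_types))

    def prob(outcome):
        count = counts[outcome]
        return count / (total + seen_types) if count > 0 else unseen_prob

    return prob


# Hypothetical counts of words emitted by one tag.
noun_emissions = Counter({"dog": 3, "cat": 1})
p_noun = witten_bell(noun_emissions, bins=6)
```

With these toy counts, N = 4 and T = 2, so seen words keep count / 6 and the remaining mass of 2 / 6 is split over the four assumed unseen word types.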
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, . . . , n, in this order:
$$\hat{t}_i = \operatorname*{argmax}_{t_i} \; P(t_i \mid \hat{t}_{i-1}) \cdot P(w_i \mid t_i)$$
assuming $\hat{t}_0$ is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
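A minimal sketch of this greedy left-to-right procedure is given below. The tag set and the probability tables TRANS and EMIT are toy values invented for illustration; in a real solution they would come from the smoothed estimates, and the name eager_tag is hypothetical.

```python
START = "<s>"
TAGS = ["DET", "NOUN", "VERB"]

# Toy transition probabilities P(tag | previous tag), invented for illustration.
TRANS = {
    START: {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.4, "NOUN": 0.4, "VERB": 0.2},
}
# Toy emission probabilities P(word | tag), invented for illustration.
EMIT = {
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.6, "barks": 0.35},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.8},
}


def eager_tag(words):
    """Choose each tag from the previously chosen tag and the current word."""
    tags = []
    previous = START
    for word in words:
        scores = {tag: TRANS[previous][tag] * EMIT[tag][word] for tag in TAGS}
        previous = max(scores, key=scores.get)
        tags.append(previous)
    return tags


eager_result = eager_tag(["the", "dog", "barks"])
```

Each decision is final once made, so an early mistake cannot be repaired by later evidence.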
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
$$\hat{t}_1 \cdots \hat{t}_n = \operatorname*{argmax}_{t_1 \cdots t_n} \left( \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \right) \cdot P(t_{n+1} \mid t_n)$$
where the tokens of the input sentence are $w_1 \cdots w_n$, and $t_0$ = ⟨s⟩ and $t_{n+1}$ = ⟨/s⟩ are
the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
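The following sketch shows one way the Viterbi recursion might look in log space, so that products of probabilities become sums of log probabilities. The tables TRANS and EMIT are toy values invented for illustration, the function name viterbi is hypothetical, and the code assumes a non-empty sentence and strictly positive probabilities (which smoothing would provide).

```python
import math

START, END = "<s>", "</s>"
TAGS = ["DET", "NOUN", "VERB"]

# Toy probabilities, invented for illustration only.
TRANS = {
    START: {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
    "DET": {"DET": 0.05, "NOUN": 0.85, "VERB": 0.05, END: 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.5, END: 0.2},
    "VERB": {"DET": 0.3, "NOUN": 0.3, "VERB": 0.1, END: 0.3},
}
EMIT = {
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.6, "barks": 0.35},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.8},
}


def viterbi(words):
    """Return the most probable tag sequence for a non-empty list of words."""
    # best[i][tag]: log probability of the best tag sequence for the first
    # i + 1 words that ends in tag.
    best = [{tag: math.log(TRANS[START][tag]) + math.log(EMIT[tag][words[0]])
             for tag in TAGS}]
    back = []  # back[i][tag]: best predecessor of tag at word i + 1
    for word in words[1:]:
        scores, pointers = {}, {}
        for tag in TAGS:
            candidates = {prev: best[-1][prev] + math.log(TRANS[prev][tag])
                          for prev in TAGS}
            prev_best = max(candidates, key=candidates.get)
            scores[tag] = candidates[prev_best] + math.log(EMIT[tag][word])
            pointers[tag] = prev_best
        best.append(scores)
        back.append(pointers)
    # Include the transition to the end-of-sentence marker before choosing.
    final = {tag: best[-1][tag] + math.log(TRANS[tag][END]) for tag in TAGS}
    tags = [max(final, key=final.get)]
    for pointers in reversed(back):
        tags.append(pointers[tags[-1]])
    tags.reverse()
    return tags


viterbi_result = viterbi(["the", "dog", "barks"])
```

Unlike the eager approach, the final choice here takes the end-of-sentence transition into account, and earlier tags are only fixed during the backward trace.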
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
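individually. That is, for each i, we compute:
$$\hat{t}_i = \operatorname*{argmax}_{t_i} \sum_{t_1 \cdots t_{i-1} t_{i+1} \cdots t_n} \left( \prod_{k=1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n)$$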
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
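$$\hat{t}_i = \operatorname*{argmax}_{t_i} \left( \sum_{t_1 \cdots t_{i-1}} \prod_{k=1}^{i} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot \left( \sum_{t_{i+1} \cdots t_n} \left( \prod_{k=i+1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n) \right)$$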
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
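As a rough sketch of how forward and backward values combine, the code below works with plain probabilities for readability; that is only safe for short toy sentences, and a real solution would move to log probabilities. The tables TRANS and EMIT are toy values invented for illustration, and the function names are hypothetical.

```python
START, END = "<s>", "</s>"
TAGS = ["DET", "NOUN", "VERB"]

# Toy probabilities, invented for illustration only.
TRANS = {
    START: {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
    "DET": {"DET": 0.05, "NOUN": 0.85, "VERB": 0.05, END: 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.5, END: 0.2},
    "VERB": {"DET": 0.3, "NOUN": 0.3, "VERB": 0.1, END: 0.3},
}
EMIT = {
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.6, "barks": 0.35},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.8},
}


def forward_backward(words):
    """Return forward and backward tables for a non-empty list of words."""
    # forward[i][t]: total probability of all tag prefixes ending in t at
    # position i, including the emission of the word at position i.
    forward = [{t: TRANS[START][t] * EMIT[t][words[0]] for t in TAGS}]
    for word in words[1:]:
        prev = forward[-1]
        forward.append({t: sum(prev[p] * TRANS[p][t] for p in TAGS) * EMIT[t][word]
                        for t in TAGS})
    # backward[i][t]: total probability of everything after position i given
    # tag t at position i, including the transition to the end marker.
    backward = [{t: TRANS[t][END] for t in TAGS}]
    for word in reversed(words[1:]):
        nxt = backward[0]
        backward.insert(0, {t: sum(TRANS[t][s] * EMIT[s][word] * nxt[s] for s in TAGS)
                            for t in TAGS})
    return forward, backward


def posterior_tags(words):
    """Pick, for each position, the tag maximising forward times backward."""
    forward, backward = forward_backward(words)
    return [max(TAGS, key=lambda t: f[t] * b[t]) for f, b in zip(forward, backward)]


posterior_result = posterior_tags(["the", "dog", "barks"])
```

A useful sanity check is that the sum over tags of forward times backward is the same at every position, since it equals the total probability of the sentence.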
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
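The idea behind the trick can be sketched independently of the provided file, whose exact contents are not assumed here. Subtracting the largest log value before exponentiating keeps at least one term equal to 1, so the sum cannot underflow to zero. The function name log_sum_exp is a choice made for this illustration.

```python
import math


def log_sum_exp(log_values):
    """Return log(sum(exp(v) for v in log_values)) without underflow."""
    highest = max(log_values)
    if highest == -math.inf:
        # All terms are log(0); the sum is zero as well.
        return -math.inf
    return highest + math.log(sum(math.exp(v - highest) for v in log_values))


# Two log probabilities far too small to represent as ordinary floats.
tiny_logs = [-1000.0, -1001.0]
naive_sum = sum(math.exp(v) for v in tiny_logs)  # both terms underflow
stable_log_sum = log_sum_exp(tiny_logs)
```

In the forward and backward computations, each sum over predecessor or successor tags would be replaced by a call of this kind on the corresponding log values.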
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
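The accuracy measure described here might be computed along the following lines. The function name tagging_accuracy and the representation of each sentence as a list of tags are assumptions made for this sketch.

```python
def tagging_accuracy(gold_sentences, predicted_sentences):
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold, predicted in zip(gold_sentences, predicted_sentences):
        if len(gold) != len(predicted):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold, predicted))
        total += len(gold)
    return 100.0 * correct / total if total else 0.0


# Hypothetical gold and predicted tags for two short sentences.
gold_tags = [["DET", "NOUN", "VERB"], ["DET", "NOUN"]]
predicted_tags = [["DET", "NOUN", "NOUN"], ["DET", "NOUN"]]
accuracy = tagging_accuracy(gold_tags, predicted_tags)
```

Counting over tokens rather than averaging per sentence keeps long and short sentences from being weighted differently.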
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and bandwidth
for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, and mistaking this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.