Both of your questions have a similar answer. Both the named entity recognition and part-of-speech tagging pipelines use machine learning models. Such models make mistakes, and these mistakes are hard to correct in general, because a model is not a deterministic set of rules. The accuracy of a model depends on several factors, including:
- the size of the training data;
- the quality of the training data;
- the size of the model.
Taking your named entity recognition example, the en_core_web_sm model is (as the name suggests) a small model. It uses a relatively small convolutional network and does not use static embeddings pretrained on a large corpus. Since the model is so limited, it may have picked up a pattern like: capitalized words are typically names, except when they occur at the beginning of a sentence (since all sentence-initial words are capitalized). That may be why it fails to annotate your example correctly. However, if you use the en_core_web_lg model instead, you will see that it returns the correct annotation:
('Dumbledore', 'PERSON')
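For reference, here is a minimal sketch of how you could check this yourself. The example sentence is made up (your original input isn't shown here), and it assumes you have downloaded the model with python -m spacy download en_core_web_lg:

import spacy

# Assumes the model was installed with:
#   python -m spacy download en_core_web_lg
nlp = spacy.load("en_core_web_lg")

# Hypothetical example sentence; substitute your own text.
doc = nlp("Dumbledore greeted the students in the Great Hall.")

for ent in doc.ents:
    print((ent.text, ent.label_))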
You'd have to dive deeper to understand why it works in this case, but en_core_web_lg is a larger model that uses pretrained word embeddings. It may be, for example, that Dumbledore occurs in the set of word embeddings and its vector is similar to the vectors of other names, allowing the model to extrapolate: since the vector of Dumbledore is similar to those of names it has seen in the training data, Dumbledore must also be a name.
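You can partially test this intuition yourself. This is only an illustrative sketch, not a definitive diagnostic: whether Dumbledore actually has a vector, and what the similarity scores come out to, depends on the vector table that ships with the model.

import spacy

nlp = spacy.load("en_core_web_lg")  # ships with static word vectors

token = nlp("Dumbledore")[0]
print(token.has_vector)  # True if the vector table contains this word

if token.has_vector:
    # Compare against a common first name and an ordinary noun;
    # a name-like vector should be closer to the name.
    print(token.similarity(nlp("Harry")[0]))
    print(token.similarity(nlp("table")[0]))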
Similar reasoning applies to your second question: the models are a trade-off between size, speed, and accuracy. In this case, too, en_core_web_lg consistently predicts 'VERB' as the tag for finished.
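Again, a small sketch you could adapt to verify this; the sentence is a made-up example containing finished:

import spacy

nlp = spacy.load("en_core_web_lg")

# Hypothetical sentence; try your own examples as well.
doc = nlp("She finished the assignment before dinner.")

for token in doc:
    print(token.text, token.pos_, token.tag_)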
So what does this mean in practice? First, models make mistakes. Second, if the error rate is not acceptable, you may want to look at larger models (such as md/lg/trf) or, if you are working in a very specific domain, at annotating more training data. Finally, do not underestimate the power of a set of rules. If you are working in a particular domain, say processing Harry Potter novels, you could get a lot of mileage out of a small set of rules to recognize names, since it is a finite set (using e.g. the entity ruler), as in the sketch below.
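For illustration, here is a minimal sketch of that rule-based approach using spaCy's entity ruler; the patterns are just a couple of hypothetical entries, not a complete list:

import spacy

nlp = spacy.load("en_core_web_sm")

# Add the entity ruler before the statistical NER component so that
# rule-based matches take precedence over the model's predictions.
ruler = nlp.add_pipe("entity_ruler", before="ner")
ruler.add_patterns([
    {"label": "PERSON", "pattern": "Dumbledore"},
    {"label": "PERSON", "pattern": [{"LOWER": "hermione"}, {"LOWER": "granger"}]},
])

doc = nlp("Dumbledore whispered something to Hermione Granger.")
print([(ent.text, ent.label_) for ent in doc.ents])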
Named Entity Recognition for Resume Parsing
I've had a good experience using spaCy, but I only use it for names, although, by default, it will also attempt to extract organizations and locations.
I use it in a non-English language, and it does require some rather extensive text pre-formatting, but it is very accurate (certainly more than 70%) - even when extracting names that are in other languages - and very fast too.
I'm sure you can fine-tune it to extract other types of entities, and they even have a visual tool to assist with that which looks pretty awesome.
By comparison, I used BERT for a multi-label classification project; it took way longer and was way more complicated to set up.