The NER label scheme varies by language and depends heavily on what training data was available. Check the "Label Scheme" entry on the model page, which should include an NER section. The Japanese models are one example.
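
The labels can also be read from a loaded pipeline instead of the model page. Here is a minimal sketch, assuming the pipeline object exposes `pipe_names` and `get_pipe` the way spaCy's `Language` object does, and that the NER component is registered under the name "ner". The helper name `ner_labels` is hypothetical.

```python
def ner_labels(nlp, pipe_name="ner"):
    """Return the sorted entity labels of the NER component, or [] if the pipeline has none."""
    # Pipelines without an NER component (e.g. a blank pipeline) simply yield no labels.
    if pipe_name not in nlp.pipe_names:
        return []
    # The component's `labels` attribute holds the label names it was trained with.
    return sorted(nlp.get_pipe(pipe_name).labels)

# Usage with an installed model (the model name is only an example):
#   import spacy
#   nlp = spacy.load("ja_core_news_sm")
#   print(ner_labels(nlp))
```

Running this against each model you plan to use shows directly how the label sets differ between languages.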

Top answer
1 of 3
22

The entity label is an attribute of the token (`Token.ent_type_`).

import spacy
nlp = spacy.load('en_core_web_lg')

s = "His friend Nicolas is here."
doc = nlp(s)

print([t.text if not t.ent_type_ else t.ent_type_ for t in doc])
# ['His', 'friend', 'PERSON', 'is', 'here', '.']

print(" ".join([t.text if not t.ent_type_ else t.ent_type_ for t in doc]) )
# His friend PERSON is here .

Edit:

To handle cases where entities can span several words, the following code can be used instead:

s = "His friend Nicolas J. Smith is here with Bart Simpon and Fred."
doc = nlp(s)
newString = s
# Iterate in reverse so substitutions do not shift the offsets of earlier entities.
for e in reversed(doc.ents):
    newString = newString[:e.start_char] + e.label_ + newString[e.end_char:]
print(newString)
# His friend PERSON is here with PERSON and PERSON.
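
The reverse-order substitution above can also be sketched without spaCy, assuming the entities are already available as (start_char, end_char, label) triples. The function name `replace_spans` and the hand-built spans are hypothetical and only stand in for what `doc.ents` would provide.

```python
def replace_spans(text, spans):
    """Replace each (start, end, label) character span in text with its label."""
    result = text
    # Work from the rightmost span to the leftmost so earlier offsets stay valid.
    for start, end, label in sorted(spans, key=lambda span: span[0], reverse=True):
        result = result[:start] + label + result[end:]
    return result

s = "His friend Nicolas J. Smith is here with Bart Simpon and Fred."
# Stand-in for model output: locate each name and label it PERSON.
names = ["Nicolas J. Smith", "Bart Simpon", "Fred"]
spans = [(s.index(name), s.index(name) + len(name), "PERSON") for name in names]

print(replace_spans(s, spans))
# His friend PERSON is here with PERSON and PERSON.
```

Sorting by start offset means the caller does not need to pass the spans in any particular order, but the sketch assumes they do not overlap.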

Update:

Jinhua Wang brought to my attention that there is now a more built-in and simpler way to do this using the merge_entities pipe. See Jinhua's answer below.

2 of 3
6

A more elegant modification to @DBaker's solution above when entities can span several words:

import spacy
nlp = spacy.load('en_core_web_lg')
nlp.add_pipe("merge_entities")

s = "His friend Nicolas J. Smith is here with Bart Simpon and Fred."
doc = nlp(s)

print([t.text if not t.ent_type_ else t.ent_type_ for t in doc])
# ['His', 'friend', 'PERSON', 'is', 'here', 'with', 'PERSON', 'and', 'PERSON', '.']

print(" ".join([t.text if not t.ent_type_ else t.ent_type_ for t in doc]) )
# His friend PERSON is here with PERSON and PERSON .

See the spaCy documentation for the merge_entities pipe for details. It uses the built-in pipeline for the job and has good support for multiprocessing. I believe this is the officially supported way to replace entities with their tags.
