Hello, fellow Redditors!
I'm looking to build an entity recognition model for my company's internal use, and I could use some guidance from the community. Essentially, I want to develop a model that can automatically extract specific entities like UID, email ID, and login ID from various types of text data, such as emails, logs, and messages.
uid -> a string of 5-15 characters from a-z (upper and lower case), with a ".", for example "ramzi"
emailid -> for example "ramzees@gmail.com"
loginid -> 4 or 5 characters from A-Z and 0-9, for example "23542"
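To make the formats above concrete, here is a rough Python sketch of how I would write them as regular expressions. The exact patterns are my own assumptions (for instance, I treat the "." in a uid as optional, and the names `PATTERNS` and `find_candidates` are made up for this post), so treat it as a starting point rather than a finished spec.

```python
import re

# Assumed patterns based on the descriptions above; the real rules may differ.
# Order matters: earlier patterns take priority when matches overlap.
PATTERNS = {
    "EMAILID": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+\b"),
    "LOGINID": re.compile(r"\b[A-Z0-9]{4,5}\b"),   # 4 or 5 characters, A-Z and 0-9
    "UID": re.compile(r"\b[A-Za-z.]{5,15}\b"),     # 5-15 letters, "." allowed (assumption)
}


def find_candidates(text):
    """Return non-overlapping (label, value, start, end) candidates, sorted by position."""
    taken = []   # spans already claimed by a higher-priority pattern
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            start, end = match.span()
            # Skip a match that overlaps a span claimed earlier (e.g. a uid inside an email).
            if any(start < e and s < end for s, e in taken):
                continue
            taken.append((start, end))
            found.append((label, match.group(), start, end))
    return sorted(found, key=lambda c: c[2])


if __name__ == "__main__":
    for candidate in find_candidates("uid ramzi, id 23542, mail ramzees@gmail.com"):
        print(candidate)
```

The catch is that the uid pattern also matches ordinary words: `find_candidates("please login")` labels both "please" and "login" as UID. That is why I think I need a model that looks at context, with patterns like these perhaps only used to pre-label candidates for annotation.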
Specifically, I need some help on:
Data Collection: What kind of data do I need to collect for training the model? How should this data be annotated?
Feature Extraction: What features should I extract from the text data to train the model effectively? Are there any best practices for feature engineering in entity recognition tasks?
Model Training: How do I train the model using the annotated data and extracted features? Which machine learning algorithms or models are suitable for entity recognition tasks?
Evaluation: What metrics should I use to evaluate the performance of my model? How do I know if it's performing well enough?
I would greatly appreciate it if someone could provide detailed steps or point me to resources/tutorials that cover each of these aspects. Any advice, tips, or best practices would be invaluable.