NER with small strongly labeled and large weakly labeled data

Having a small strongly labeled dataset alongside a large weakly labeled one is a very common situation in NLP and ASR modeling. The Amazon search team used the three-stage NEEDLE framework to take advantage of large weakly labeled data to improve NER. Their noise-aware loss function is interesting and worth a deep dive. Paper link:

To Review Deep Learning

I am going back to working on deep learning after six months of writing bash. Here is my plan to get back up to speed.


  • Deep learning (Andrew Ng):
  • Book (Part 2):

CV or NLP:

  • Convolutional Neural Networks for Visual Recognition (Spring 2017):
  • CS224N: Natural Language Processing with Deep Learning | Winter 2019:

PyTorch and TensorFlow:

  • fast.ai (PyTorch):
  • TensorFlow: