Deep Learning for Anomaly Detection: A Survey

Anomaly detection is an important problem that has been well-studied within diverse research areas and application domains. The aim of this survey is two-fold, firstly we present a structured and comprehensive overview of research methods in deep learning-based anomaly detection. Furthermore, we review the adoption of these methods for anomaly across various application domains and assess their effectiveness. We have grouped state-of-the-art research techniques into different categories based on the underlying assumptions and approach adopted. Within each category we outline the basic anomaly detection technique, along with its variants and present key assumptions, to differentiate between normal and anomalous behavior. For each category, we present we also present the advantages and limitations and discuss the computational complexity of the techniques in real application domains. Finally, we outline open issues in research and challenges faced while adopting these techniques.

https://arxiv.org/abs/1901.03407

Do CIFAR-10 Classifiers Generalize to CIFAR-10?

https://arxiv.org/abs/1806.00451

Machine learning is currently dominated by largely experimental work focused on improvements in a few key tasks. However, the impressive accuracy numbers of the best performing models are questionable because the same test sets have been used to select these models for multiple years now. To understand the danger of overfitting, we measure the accuracy of CIFAR-10 classifiers by creating a new test set of truly unseen images. Although we ensure that the new test set is as close to the original data distribution as possible, we find a large drop in accuracy (4% to 10%) for a broad range of deep learning models. Yet more recent models with higher original accuracy show a smaller drop and better overall performance, indicating that this drop is likely not due to overfitting based on adaptivity. Instead, we view our results as evidence that current accuracy numbers are brittle and susceptible to even minute natural variations in the data distribution.

Semantics vs. Syntax

Syntax is the grammar. It describes the way to construct a correct sentence. For example, this water is triangular is syntactically correct.
Semantics relates to the meaning. this water is triangular does not mean anything, though the grammar is ok.


https://stackoverflow.com/questions/209979/are-semantics-and-syntax-the-same

BERT for text classification

BERT can achieve high accuracy with small sample size (e.g. 1000): https://github.com/Socialbird-AILab/BERT-Classification-Tutorial/blob/master/pictures/Results.png

To simply get features (embedding) from BERT, this Keras package is easy to start with: https://pypi.org/project/keras-bert/

For fine-tuning with GPUs, this PyTorch version is handy (gradient accumulation is implemented): https://github.com/huggingface/pytorch-pretrained-BERT

For people how have access to TPUs: https://github.com/google-research/bert