BERT: Pre-training of Deep Bidirectional Transformers (Devlin et al. 2018)

draft; certainty: unlikely; importance: 5

Draft: not yet written.

Overview

Key Ideas

Further Reading